class is one of the most important classes in WebKit’s editing component.
Position objects are used in editing, accessibility, hit testing,
and other components to represent a point in DOM.
To illustrate this idea, consider the following markup:
<div>This is <b>bolded</b></div>
When this markup is parsed by WebKit, a DOM tree like this will be generated (partial, of course, because HTML5 mandates that the document and the body nodes should automatically be generated):
div #text("This is ") b #text("bolded")
#text(X) represents a text node with text
On many user agents (e.g. web browsers),
users can place and manipulate the caret (a.k.a. insertion point)
and selection around text by mouse or keyboard.
Position objects represent end points of caret and selection.
We have the
that can represent an arbitrary caret or selection using two Position objects
that controls the selection of each frame.
For example, if you select the text
VisibleSelection object that represents this selection has
Position object representing a point before
Position object representing a point after
(containerNode, offset) Pair
Now, how does one represent a point in DOM?
One way to do this is by using (containerNode, offset) pair as does
the Range interface in DOM.
In this model, a position (containerNode=X, offset=N) is between the children
at offsets N and N+1 of the DOM node X.
e.g. (div, 0) represents the position immediately before
and (div, 1) represents the position immediately after
and immediately before the b element.
While (containerNode, offset) pair representation is the one exposed to the Web
it poses a bit of challenge when used in editing.
Consider the following DOM now:
div #text("Here is an image:") br br img
And suppose the caret is after the two br elements and before the img element.
This position can be represented as (div, 3).
Now further suppose that the user decides to remove the two br elements between
#text("Here is an image:") and the img element to get:
#text("Here is an image:")
Here, the position (div, 3) is no longer valid because the div element has only 2 children. To this fix problem, DOM level 2 specifies how Range should be updated upon DOM mutation, resulting in performance where m is the number of Range objects attached in the same document.
In WebKit, we tried to work-around this issue by introducing editing offsets. In the context of editing offsets, we allow offset 1 with nodes that do not have any children (more precisely, editing ignores content) to represent the positions after the nodes. It also treat offset 0 to represent the positions before the nodes. For example, in the above example, we can use the position (img, 0) to mean the position before the img element and (img, 1) to mean the position after the img element. Under this representation, the positions will remain valid under insertions and deletion of siblings of the img element.
One subtle change we had to make to (containerNode, offset) pair representation is that
containerNode was no longer a container node because depending on the type of the node,
this position can reside before or after the node.
Thus we started calling it an anchor node instead and the
Position became a pair (anchorNode, offset).
Unfortunately, the notion of editing offsets really confused the rest of WebKit
and some code in editing component itself because the meaning of offsets changed depending on the node.
For example, (X, 0) could mean either that the position is before the node X or that
the position is first position inside X depending on
whether X’s content is ignored by editing or not (i.e.
So WebKit gurus eseidel
amongst others came up with the idea of anchor type.
Initially, the anchor type of a position was an enum with 3 values:
PositionIsOffsetInAnchor corresponds to a (node, offset) pair that DOM’s
Range interface implements
PositionIsAfterAnchor correspond to
before-node and after-node positions introduced by editing offsets.
So the the position became a 3-tuple (anchorNode, anchorType, offset), and we’ve added new constructors for Position that explicitly takes an anchor type, and made other constructors auto-compute the anchor type for legacy editing-offset-based positions.
This is about the time when I joined the WebKit team
and realized something strange.
I found that a lot of editing code called
which returns the anchor node of a position, without checking the anchor type of the position.
Consider the markup
<div>a<p>b</p></div> and its DOM tree:
div #text("a") p #text("b")
Now, if we have a position between
#text("a") and p,
we can represent it as (p, PositionIsBeforeAnchor, 0) where the last value is ignored.
And take a look at the following WebKit code that
Node* enclosingBlockNode = enclosingBlock(m_selection.extent().node());
m_selection.extent() is a
enclosingBlock is a function
that returns the closest block ancestor of the specified node
enclosingBlock walks the DOM tree upwards from the given node
and returns the first block node it encounters).
m_selection.extent() was (div, PositionIsOffsetAnchor, 0),
enclosingBlockNode will be the
and it’ll be the
p element if
m_selection.extent() was (#text(“b”), PositionIsBeforeAnchor, 0).
A problem surfaces when we let
m_selection.extent() be (p, PositionIsBeforeAnchor, 0)
In this case, the position resides between #text(“a”) and the
even though the anchor node is the
node() returns the
enclosingBlock will return the
p element itself.
This is clearly wrong because the position resides outside the
This happens because the code assumed that a position always resides inside its anchor node,
which isn’t the case for
PositionIsAfterAnchor anchor types.
Here’s another common mistake:
insertionPos is a position and
newNode is some node to be inserted.
insertNodeBefore is a function that inserts the first node
into the parent of the second node immediately before the second node.
This code makes a lot of sense if the anchor type of
because then we’ll be inserting
newNode right before the
Unfortunately, the code makes little sense when the anchor type of
in which case
newNode is inserted one element ahead of the
It makes even less sense when the anchor type is
PositionIsOffsetInAnchor, in which case,
newNode is inserted before the container node of insertionPos.
After realizing this very fundamental issue present in hundreds of lines of code, I declared Position::node obsolete and added deprecated in its name.
More Anchor Types for Efficiency
During the 2011 WebKit Contributors' Meeting, we developed further discussion on how to refactor and improve Position and related classes. One of the ideas that came up during the meeting was to restrict the use of offsets to text nodes to allow fast comparison between positions. To see where we’re coming from, consider two positions (p, PositionIsBeforeAnchor, 0) and (div, PositionIsOffsetInAnchor, 1) in the following DOM tree:
div #text("a") p #text("b")
They represent the same position between
#text("a") and the
However, bit-wise comparison of the position fails because their internal representation is different.
To make things worse, WebKit stores the children of a node as a liked list.
This would mean that to compare the above two positions, we must be traversing a linked list,
yielding performance where n is the number of nodes in the DOM.
This had also caused problems in transitioning from legacy editing positions to new explicitly typed positions
because a lot of position comparison didn’t take these differences into account.
To mitigate this problem, we decided to introduce two more anchor types:
and restrict the use of
PositionIsOffsetInAnchor to only text nodes.
We’ll replace (X, PositionIsOffsetInAnchor, N) by (X.childNodes[N], PositionIsBeforeAnchor, N)
when N is less than the number of children and by (X, PositionIsAfterChildren, N)
when N is equal to the number of children when X is a non-text node;
or by (X.childNodes[N-1], PositionIsAfterAnchor, N) when N > 0
and (X, PositionIsBeforeChildren, 0) when N = 0.
To illustrate how this works, let me enumerate all possible positions inside the
- (div, PositionIsBeforeChildren) = (#text(“a”), PositionIsBeforeAnchor)
- (#text(“a”), PositionIsOffsetInAnchor, 0)
- (#text(“a”), PositionIsOffsetInAnchor, 1)
- (#text(“a”), PositionIsAfterAnchor) = (p, PositionIsBeforeAnchor)
- (p, PositionIsBeforeChilden) = (#text(“b”), PositionIsBeforeAnchor)
- (#text(“b”), PositionIsOffsetInAnchor, 0)
- (#text(“b”), PositionIsOffsetInAnchor, 1)
- (#text(“b”), PositionIsAfterAnchor) = (p, PositionIsAfterChildren)
- (p, PositionIsAfterAnchor) = (div, PositionIsAfterChildren)
Note I’ve eliminated legacy editing positions and
PositionIsOffsetInAnchor positions for non-text nodes.
Because each position has at most two possible representations
and the anchor node of one representation is always the first child
or the last child of the anchor node of another representation,
comparison between these positions can be done in .
Furthermore, the set of positions that use only
will uniquely represent each position in the DOM, making them isomorphic to the set of positions defined
in the DOM level 2 Traversal and Range specification.
- WebKit uses
Positionclass to store selection end points.
- There are 5 anchor types:
- In addition, there’s legacy editing positions that use the editing offsets. In editing offsets, 0 means before-anchor and 1 means after-anchor.
anchorNode()is usually wrong for some types of positions.
- We don’t want (node, offset, PositionIsOffsetInAnchor) pair for non-text nodes.