UndoManager and DOM Transaction

Working Draft — 30 August 2011

Editor:
Ryosuke Niwa <rniwa@webkit.org>
Acknowledgements
Anne van Kesteren, Annie Sullivan, Alex Russell, Ehsan Akhgari, Eric Uhrhane, Frederico Caldeira Knabben, Ian Hickson, Johan "Spocke" Sörlin, Jonas Sicking, Ojan Vafai
Latest version:
http://rniwa.com/editing/undomanager.html
Previous versions:
http://rniwa.com/editing/undomanager-2011-08-09.html
http://rniwa.com/editing/undomanager-2011-08-08.html
http://rniwa.com/editing/undomanager-2011-07-26.html
Use cases:
http://rniwa.com/editing/undomanager-usecases.html

Status

This document is an early proposal for the specification of UndoManager and DOM transactions. I hope that, after it is proposed to the WHATWG, this specification will eventually be merged into the main HTML specification and replace its UndoManager section.

Table of Contents

  1 Introduction
  2 Undo Scope and Undo Manager
    2.1 Definitions
    2.2 Undo scope
    2.3 The UndoManager interface
    2.4 Undo: moving back in the undo transaction history
    2.5 Redo: moving forward in the undo transaction history
  3 Transaction and DOM changes
    3.1 The Transaction interface
    3.2 Managed transactions
    3.3 Manual transactions
    3.4 Mutations of DOM
  4 Transaction Event
    4.1 The TransactionEvent interface
    4.2 The Mutation interface
    4.3 Transaction, undo, and redo events
  5 Edit action event

1 Introduction

This specification defines the API to manage the user agent's undo transaction history (also known as the undo stack) and to create objects that can be managed by the undo transaction history.

Many rich text editors on the Web add editing operations that are not natively supported by execCommand and other Web APIs. For example, many editors modify the DOM after the user agent has executed user editing actions, both to work around user agent bugs and to customize the behavior for their own use.

However, doing so breaks the user agent's native undo and redo because the user agent cannot undo DOM modifications made by scripts. This forces editors to re-implement undo and redo entirely from scratch. Indeed, many editors store innerHTML as a string and recreate the entire editable region whenever a user tries to undo or redo. This is very inefficient and has limited the depth of their undo stacks.

Also, any Web app that tries to mix contenteditable regions or text fields with canvas or other non-text editable regions has to reimplement all of undo and redo, because the user agent typically has one undo transaction history per document and there is no easy way to add a new undo entry to the user agent's native undo transaction history.

2 Undo Scope and Undo Manager

2.1 Definitions

The user agent must associate an undo transaction history, a list of transaction and transaction group entries, with each UndoManager object.

The undo transaction history has an undo position. This is the position between two entries in the undo transaction history's list where the next entry represents what needs to happen when undo is done, and the previous entry represents what needs to happen when redo is done.

The undo scope is the collection of DOM nodes that are managed by the same UndoManager. A document node or an element with the undoscope attribute defines a new undo scope, and all descendant nodes of the element, excluding elements with the undoscope attribute and their descendant nodes, will be managed by a new UndoManager. An undo scope host is an element with the undoscope attribute or the document.

2.2 Undo scope

The undoscope attribute is a boolean attribute that controls the default undo scope of an element.

When the undoscope content attribute is added, the user agent must define a new undo scope for the element, and the user agent must create a new UndoManager to manage any DOM changes made to all descendant nodes of the element, excluding those that have the undoscope content attribute and their descendant nodes.

When the undoscope content attribute is removed, the user agent must delete all entries in the undo transaction history of the corresponding undo scope and destroy the corresponding UndoManager for the scope. The node from which the content attribute is removed and its descendant nodes, excluding elements with the undoscope content attribute and their descendant nodes, then belong to the undo scope of the closest ancestor with the undoscope content attribute or of the document.

The undoScope IDL attribute must reflect the undoscope content attribute.

The contenteditable content attribute does not define a new undo scope, and all editing hosts share the same UndoManager by default. However, the contenteditable content attribute affects whether a sequence of transactions can be a proper sequence of managed transactions.

Undo scope is needed to separate the undo transaction histories of multiple editable regions without scripts. Using the undoscope content attribute, authors can, for example, easily give the text fields in a widget a separate undo transaction history.
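
As a rough illustration of how the undo scope of a node could be resolved, the sketch below walks up a simplified tree of plain objects (not real DOM nodes) until it reaches an undo scope host. The node shape and the names makeNode, isUndoScopeHost, and findUndoScopeHost are assumptions made for this example only and are not part of the proposed API. It also assumes that a host element belongs to its own scope, which the text suggests but does not spell out.

```javascript
// Simplified stand-ins for DOM nodes: plain objects with a parent link.
// `undoscope` mimics the boolean content attribute; `isDocument` marks the root.
function makeNode(name, parent, options) {
  const opts = options || {};
  return {
    name: name,
    parent: parent || null,
    undoscope: Boolean(opts.undoscope),
    isDocument: Boolean(opts.isDocument),
  };
}

// A node is an undo scope host if it is the document or has undoscope.
function isUndoScopeHost(node) {
  return node.isDocument || node.undoscope;
}

// Walk up from the node until an undo scope host is found.
// Assumption: a host element is treated as part of its own scope.
function findUndoScopeHost(node) {
  let current = node;
  while (current !== null && !isUndoScopeHost(current)) {
    current = current.parent;
  }
  return current; // null only if the node is not attached to a document
}

const doc = makeNode('document', null, { isDocument: true });
const widget = makeNode('div#widget', doc, { undoscope: true });
const field = makeNode('input', widget);
const editor = makeNode('div[contenteditable]', doc);

console.log(findUndoScopeHost(field).name);  // div#widget
console.log(findUndoScopeHost(editor).name); // document
```

The text field inside the widget resolves to the widget's scope, while the editing host without undoscope falls through to the document, matching the default described above.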

TODO: We need to specify what happens to existing transactions that cross the management boundary when an undoscope attribute is added or removed.

2.3 The UndoManager interface

To manage transaction entries in the undo transaction history, the UndoManager interface can be used:

interface UndoManager {
    void transact(in Object transaction, in boolean merge);
    void undo();
    void redo();
    getter Transaction item(in unsigned long index);
    readonly attribute unsigned long length;
    readonly attribute unsigned long position;
    void clearUndo();
    void clearRedo();
};
document . undoManager

Returns the UndoManager object.

element . undoManager

Returns the UndoManager object.

undoManager . transact(transaction, merge)

Clears the entries above the current undo position, applies transaction, and pushes it onto the undo transaction history. If merge is set to true, it also forms a transaction group.

undoManager . undo()

Unapplies the transaction or all transactions in the transaction group immediately after the current position and increments position by 1 if position < length.

undoManager . redo()

Reapplies the transaction or all transactions in the transaction group immediately before the current position and decrements position by 1 if position > 0.

undoManager . position

Returns the number of the current entry in the undo transaction history. (Entries at and past this point are redo entries.)

undoManager . length

Returns the number of entries in the undo transaction history.

data = undoManager . item(index)
undoManager[index]

Returns the entry with index index in the undo transaction history.

Returns null if index is out of range.

undoManager . clearUndo()

Removes entries in the undo transaction history before position and resets position to 0.

undoManager . clearRedo()

Removes entries in the undo transaction history after position.

The undoManager IDL attribute of Node interface must return the object implementing the UndoManager interface for the undo scope if the node is an undo scope host. If the node is not an undo scope host, it must return null.

UndoManager objects represent and manage their node's undo transaction history.

The object's supported property indices are the numbers in the range zero to length-1, unless the length is zero, in which case there are no supported property indices.

The transact(transaction, merge) method performs the following steps:

  1. Clear all transactions and transaction groups before the current undo position.
  2. Set the transaction's host property to the associated undo scope host if it's null. Otherwise throw an exception. (TODO: Specify which exception it is.)
  3. Apply the transaction.
  4. If merge is not set to true, add transaction to the top of undo transaction history and end.
  5. Otherwise, if the first entry in the undo transaction history is a transaction group, then add transaction to the top of the array that forms the transaction group and end.
  6. Otherwise, create a new array and insert the new transaction and the first entry in undo transaction history into the array, in that order, to form a new transaction group.
  7. Replace the first entry in undo transaction history with the new transaction group.
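
The steps above can be sketched on a plain array in which index 0 is the top of the undo transaction history. This is a minimal sketch, not a definitive implementation: the class name SketchUndoManager is hypothetical, a generic Error stands in for the unspecified exception, and treating merge on an empty history as a plain push is an assumption, since the steps do not cover that case.

```javascript
// Sketch of the transact() steps on a plain array; index 0 is the top.
class SketchUndoManager {
  constructor(host) {
    this.host = host;   // the undo scope host this manager belongs to
    this.history = [];  // each entry is a transaction or an array (group)
    this.position = 0;  // index of the undo position
  }

  transact(transaction, merge) {
    // Step 1: drop the entries before the undo position.
    this.history.splice(0, this.position);
    this.position = 0;

    // Step 2: claim the transaction. The exception type is unspecified,
    // so a generic Error is used here as a placeholder.
    if (transaction.host != null) {
      throw new Error('transaction already has a host');
    }
    transaction.host = this.host;

    // Step 3: apply the transaction.
    transaction.apply();

    // Step 4: without merge, push a standalone entry onto the top.
    // Assumption: an empty history is handled the same way.
    if (merge !== true || this.history.length === 0) {
      this.history.unshift(transaction);
      return;
    }

    // Step 5: the top entry is already a group, so add to its top.
    if (Array.isArray(this.history[0])) {
      this.history[0].unshift(transaction);
      return;
    }

    // Steps 6 and 7: wrap the new transaction and the old top entry in a group.
    this.history[0] = [transaction, this.history[0]];
  }
}

const log = [];
const manager = new SketchUndoManager('editor');
manager.transact({ label: 'Typing', apply: () => log.push('o') }, false);
manager.transact({ label: 'Typing', apply: () => log.push('k') }, true);
manager.transact({ label: 'Typing', apply: () => log.push('br') }, false);
console.log(manager.history.length);            // 2
console.log(Array.isArray(manager.history[1])); // true
```

After the three calls, the top entry is the standalone "br" transaction and the entry below it is the group formed by "k" and "o".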

The undo() unapplies the transaction immediately after the undo position and moves the undo position forward (position is incremented by 1) if position < length. Otherwise, it must do nothing.

The redo() reapplies the transaction immediately before the undo position and moves the undo position backward (position is decremented by 1) if position > 0. Otherwise, it must do nothing.

The item(n) method must return the nth transaction's associated data in the undo transaction history, if there is one, or null otherwise.

Being able to access an arbitrary element in the undo transaction history is needed to allow scripts to determine whether a new transaction and the last transaction should form a transaction group.

The position attribute must return the index of the undo position in the undo transaction history. If there are no transactions to redo, then the value must be the same as the length attribute. If there are no transactions to undo, then the value must be zero.

The length attribute must return the number of entries in the undo transaction history. This is the length.

The clearUndo() method must remove all entries in the undo transaction history before the undo position. It also moves the undo position to the top (position is set to zero).

The clearRedo() method must remove all entries in the undo transaction history after the undo position.

The active undo manager is the UndoManager of the focused node in the document. If no node has focus, then the active undo manager is the document's UndoManager.

A transaction group is an array of consecutive transactions that belong to the same UndoManager such that all transactions in the array are unapplied and reapplied together in one undo or redo. A transaction group is formed when a transaction is applied via UndoManager's transact() method with merge set to true and the UndoManager already has a transaction group or a transaction.

A typical use case for a transaction group is typing where insertions of multiple letters, spaces, and new lines can be undone or redone in one step.

When transactions in a transaction group are unapplied, the user agent must unapply each transaction in the array that forms the transaction group in the ascending order from the first entry to the last entry. When transactions in a transaction group are reapplied, the user agent must reapply each transaction in the array in descending order from the last entry to the first entry.
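
The ordering rule above can be illustrated with a small sketch. It assumes that a group is an array whose first entry is the most recently applied transaction (because transact() adds to the top of the array); unapplyGroup, reapplyGroup, and typing are hypothetical helpers, and a string stands in for the DOM.

```javascript
// A group is an array whose first entry is the most recently applied transaction.
function unapplyGroup(group) {
  // Ascending order: newest transaction first.
  for (let i = 0; i < group.length; i += 1) {
    group[i].unapply();
  }
}

function reapplyGroup(group) {
  // Descending order: oldest transaction first.
  for (let i = group.length - 1; i >= 0; i -= 1) {
    group[i].reapply();
  }
}

// A string stands in for the DOM; each transaction types one character.
let typed = 'ok';
function typing(ch) {
  return {
    label: 'Typing',
    unapply: () => { typed = typed.slice(0, -1); },
    reapply: () => { typed += ch; },
  };
}

const group = [typing('k'), typing('o')]; // 'o' was typed first, 'k' is on top
unapplyGroup(group);
console.log(JSON.stringify(typed)); // ""
reapplyGroup(group);
console.log(typed); // ok
```

Reapplying in the opposite direction would produce "ko", which is why the two directions differ.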

In the following example, letters "o" and "k" are inserted by two managed transactions that form one transaction group. A br element and string "hi" are then inserted by two other managed transactions to form another transaction group. All transactions have the label "Typing".

// Assume myEditor is an element that has the undoscope attribute, and insert(node)
// is a function that inserts the specified node at the caret.
myEditor.undoManager.transact({apply: function () { insert(document.createTextNode('o')); }, label: 'Typing'}, false);
myEditor.undoManager.transact({apply: function () { insert(document.createTextNode('k')); }, label: 'Typing'}, true);
myEditor.undoManager.transact({apply: function () { insert(document.createElement('br')); }, label: 'Typing'}, false);
myEditor.undoManager.transact({apply: function () { insert(document.createTextNode('hi')); }, label: 'Typing'}, true);

When the first undo is executed immediately after this code is run, the last two transactions are unapplied, and the br element and string "hi" will be removed from the DOM. The second undo will unapply the first two transactions and remove "o" and "k".

Because Mac OS X and other frameworks expect applications to provide an array of undo items, simply dispatching undo and redo events and having scripts manage undo transaction history would not let the user agent populate the native UI properly.

2.4 Undo: moving back in the undo transaction history

When the user invokes an undo operation, or when the execCommand() method is called with the undo command, the user agent must perform an undo operation on the active undo manager.

If the undo position is at the end of the undo transaction history (position is equal to length), then the user agent must do nothing.

Otherwise, the user agent must unapply the transaction immediately after the undo position and move the undo position forward (increment position by 1).

2.5 Redo: moving forward in the undo transaction history

When the user invokes a redo operation, or when the execCommand() method is called with the redo command, the user agent must perform a redo operation on the active undo manager.

If the undo position is at the beginning of the undo transaction history (position is zero), then the user agent must do nothing.

Otherwise, the user agent must reapply the transaction immediately before the undo position and move the undo position backward (decrement position by 1).
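
The position arithmetic of sections 2.4 and 2.5 can be sketched as follows. This is only an illustration under stated assumptions: the history is a plain array with the most recent entry at index 0, every entry is a single transaction exposing unapply and reapply functions, and the names undo, redo, typing, and state are hypothetical.

```javascript
// Undo: unapply the entry immediately after the undo position, then move forward.
function undo(state) {
  if (state.position >= state.history.length) {
    return false; // position equals length: nothing to undo
  }
  state.history[state.position].unapply();
  state.position += 1;
  return true;
}

// Redo: reapply the entry immediately before the undo position, then move backward.
function redo(state) {
  if (state.position <= 0) {
    return false; // position is zero: nothing to redo
  }
  state.history[state.position - 1].reapply();
  state.position -= 1;
  return true;
}

// A string stands in for the DOM; each transaction types one character.
let text = 'ab';
function typing(ch) {
  return {
    label: 'Typing',
    unapply: () => { text = text.slice(0, -1); },
    reapply: () => { text += ch; },
  };
}

// 'a' was typed first, so the transaction for 'b' sits at the top (index 0).
const state = { history: [typing('b'), typing('a')], position: 0 };
undo(state);              // text is now "a", position is 1
undo(state);              // text is now "", position is 2
console.log(undo(state)); // false
redo(state);              // text is now "a", position is 1
console.log(text, state.position); // a 1
```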

3 Transaction and DOM changes

A transaction is an ordered set of DOM changes associated with a unique undo scope host that can be applied, unapplied, or reapplied.

To apply a transaction means to make the associated DOM changes under the associated undo scope host. And to unapply and to reapply a transaction means, respectively, to revert and to remake the associated DOM changes under the associated undo scope host.

A transaction can be unapplied or reapplied if it appears, respectively, immediately after or immediately before the undo position in the associated UndoManager's undo transaction history.

3.1 The Transaction interface

[NoInterfaceObject]
interface Transaction {
    attribute DOMString label;
    readonly attribute Node host;
    attribute Function? apply;
    attribute Function? unapply;
    attribute Function? reapply;
};

The Transaction interface is to be implemented by content scripts that implement a transaction.

The label attribute must return the value that describes the semantics of the transaction such as "Inserting text" or "Deleting selection". The user agent may expose this string through its native UI such as a menu bar or context menu.

The host attribute must return the undo scope host under which the transaction is applied. When an undo manager is about to apply the transaction, the undo manager must set the host attribute to the corresponding undo scope host.

apply, unapply, and reapply are attributes that must be supported, as IDL attributes, by objects implementing the Transaction interface.

3.2 Managed transactions

A managed transaction is a transaction where DOM changes are tracked by the user agent and the logic to unapply or reapply the transaction is implicitly created by the user agent.

When the apply attribute of an object that implements the Transaction interface returns a function, and the unapply attribute and the reapply attribute both return null or undefined, the object implements a managed transaction.

When a managed transaction is applied by calling UndoManager's transact() method, the function returned by apply attribute is invoked. All DOM changes made by the method must be tracked by the user agent.

TODO: Need to specify what happens when apply returned a non-function object. Probably throw an exception.
TODO: We need to restrict what apply function can do. Particularly with regards to changing undoscope attribute or undoScope property. We should probably forbid modifying undo scope in a transaction.

When a managed transaction is unapplied or reapplied, the user agent must revert DOM changes made during its application provided all DOM changes made prior to unapplying or reapplying the transaction and after the transaction was applied were made by a proper sequence of managed transactions.

TODO: Need to restore selection as well.

A proper sequence of managed transactions under an UndoManager is a sequence made up of the following: applying, unapplying, or reapplying managed transactions through the UndoManager; any changes to states other than DOM states, such as CSS rules, selection, and script objects; and any DOM changes to nodes that are neither ancestors nor descendants of the lowest common ancestor of the nodes mutated by the managed transactions or of the highest editing host under which user editing actions are taken.

For example, managed transactions applied, unapplied, or reapplied inside a contenteditable div and managed transactions applied, unapplied, or reapplied inside a canvas that is a sibling of the editable div constitute a proper sequence of managed transactions. On the other hand, if any manual transactions were to mutate nodes inside the editing host in which user editing actions were taken by the browser, then they would not constitute a proper sequence of managed transactions.

TODO: This part of the spec is a horrible mess. We need to clarify the details.

Because only UndoManager can apply, unapply, or reapply transactions, any proper sequence of managed transactions satisfies the following conditions:

Because of these conditions, unapplying and reapplying transactions can be thought of as transformations and inverse-transformations on the undo scope host and its descendants. In fact, the user agent must not store the entire DOM state before and after each transaction to implement managed transactions.
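
To make the idea of transformations and inverse transformations more concrete, the sketch below records each mutation made during a transaction and reverts it by running the inverse operations in reverse order, without ever taking a snapshot. It is a toy model under heavy assumptions: an array of strings stands in for a node's children, and TrackedList, record, revert, and replay are hypothetical names, not a description of how a user agent tracks DOM changes.

```javascript
// A list of child "nodes" (strings) whose mutations can be recorded.
class TrackedList {
  constructor(items) {
    this.items = items.slice();
    this.log = null; // non-null only while a transaction is being recorded
  }

  insert(index, value) {
    this.items.splice(index, 0, value);
    if (this.log) this.log.push({ type: 'insert', index: index, value: value });
  }

  remove(index) {
    const value = this.items.splice(index, 1)[0];
    if (this.log) this.log.push({ type: 'remove', index: index, value: value });
  }

  // Run fn while recording the mutations it makes; return the mutation log.
  record(fn) {
    const log = [];
    this.log = log;
    try {
      fn(this);
    } finally {
      this.log = null;
    }
    return log;
  }

  // Unapply: run the inverse of each mutation, last mutation first.
  revert(log) {
    for (let i = log.length - 1; i >= 0; i -= 1) {
      const m = log[i];
      if (m.type === 'insert') this.items.splice(m.index, 1);
      else this.items.splice(m.index, 0, m.value);
    }
  }

  // Reapply: run each mutation again in its original order.
  replay(log) {
    for (const m of log) {
      if (m.type === 'insert') this.items.splice(m.index, 0, m.value);
      else this.items.splice(m.index, 1);
    }
  }
}

const list = new TrackedList(['a', 'b']);
const changes = list.record((l) => {
  l.insert(1, 'x'); // a, x, b
  l.remove(0);      // x, b
});
list.revert(changes);
console.log(list.items.join(',')); // a,b
list.replay(changes);
console.log(list.items.join(',')); // x,b
```

Replaying a log is only valid if the list is in the state the log expects, which mirrors the requirement that intervening changes come from a proper sequence of managed transactions.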

The user agent must implement user editing actions as managed transactions, and any application defined managed transactions must be compatible with user editing actions.

TODO: Need to specify what happens if event listeners on DOM mutation events attempt to modify the DOM during apply, unapply, or reapply.

3.3 Manual transactions

A manual transaction is a transaction where the logic to apply, unapply, or reapply the transaction is explicitly defined by an application.

When the unapply attribute or the reapply attribute of an object that implements Transaction interface returns a function, the object implements a manual transaction.
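
The two definitions can be combined into a small classification sketch. The function classifyTransaction and the 'unspecified' result are assumptions made for illustration; the draft does not say what happens when an object matches neither definition.

```javascript
function isFunction(value) {
  return typeof value === 'function';
}

// Classify an object against the managed and manual transaction definitions.
function classifyTransaction(transaction) {
  // Manual: unapply or reapply returns a function.
  if (isFunction(transaction.unapply) || isFunction(transaction.reapply)) {
    return 'manual';
  }
  // Managed: apply returns a function and the other two are null or undefined.
  if (isFunction(transaction.apply) &&
      transaction.unapply == null &&
      transaction.reapply == null) {
    return 'managed';
  }
  // Anything else is not covered by the text (assumption: report it as such).
  return 'unspecified';
}

console.log(classifyTransaction({ label: 'Typing', apply() {} }));          // managed
console.log(classifyTransaction({ label: 'Draw', apply() {}, unapply() {} })); // manual
console.log(classifyTransaction({ label: 'Empty' }));                       // unspecified
```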

To create a manual transaction, the ManualTransaction interface can be used:

When a manual transaction is applied by calling UndoManager's transact() method, the function returned by the apply attribute is invoked if the attribute returns a valid function object.

When a manual transaction is unapplied, the function returned by the unapply attribute is invoked if the attribute returns a valid function object.

When a manual transaction is reapplied, the function returned by the reapply attribute is invoked if the attribute returns a valid function object. If the reapply attribute returned null or undefined, then the function returned by apply attribute is invoked instead.
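
The three rules above, including the fallback from reapply to apply, can be sketched as follows. The helper names applyManual, unapplyManual, and reapplyManual are hypothetical, and an array of shape names stands in for a canvas-based drawing surface.

```javascript
// Invoke apply if it is a function.
function applyManual(transaction) {
  if (typeof transaction.apply === 'function') transaction.apply();
}

// Invoke unapply if it is a function.
function unapplyManual(transaction) {
  if (typeof transaction.unapply === 'function') transaction.unapply();
}

// Invoke reapply if it is a function; fall back to apply when reapply is
// null or undefined.
function reapplyManual(transaction) {
  if (typeof transaction.reapply === 'function') {
    transaction.reapply();
  } else if (transaction.reapply == null && typeof transaction.apply === 'function') {
    transaction.apply();
  }
}

// An array of shape names stands in for the state of a drawing surface.
const shapes = [];
const drawCircle = {
  label: 'Draw circle',
  apply: () => { shapes.push('circle'); },
  unapply: () => { shapes.pop(); },
  // No reapply is given, so apply is reused when the transaction is reapplied.
};

applyManual(drawCircle);    // shapes: circle
unapplyManual(drawCircle);  // shapes: (empty)
reapplyManual(drawCircle);  // shapes: circle
console.log(shapes.length); // 1
```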

Because a manual transaction cannot be part of a proper sequence of managed transactions, it can be incompatible with user editing actions. In fact, the user agent can fail to unapply or reapply transactions created by user editing actions once the associated UndoManager applies any manual transaction.

The main rationale behind having manual transactions is to let websites add undo items to the user agent's native UI.

The rationale behind having the reapply attribute in addition to apply is to let the application do extra work in reapply, such as restoring the selection.

3.4 Mutations of DOM

DOM changes of an element are any changes to the element and its descendant nodes that may trigger DOM level 3's DOM mutation events. This includes, but is not limited to, the following mutations:

HTML5 spec says that internal state changes should be considered as DOM changes but in practice, it's infeasible to store all document states.

The DOM state of an element is the state of all descendant elements and their attributes that are transient under DOM changes of the element.

TODO: Need a better definition.

4 Transaction Event

4.1 The TransactionEvent interface

TODO: Write a spec for TransactionEvent interface. This should address the following use case:

4.2 The Mutation interface

TODO: write a spec for Mutation interface.

4.3 Transaction, undo, and redo events

When a browser or script creates a new transaction, the browser must create a TransactionEvent object and dispatch it at the element associated with the transaction.

TODO: write a spec for transaction event. It should provide a list of DOM mutations and high-level semantics.

5 Edit action event

I'm not sure the execCommand/editaction event belongs in this spec. Maybe it's better left in the HTML5 or Web Apps spec? Should it be specified together with a key-binding mechanism?