Ymacs development docs
The documentation is scarce for now. Hopefully the source code is easy to read and mostly explains itself. In any case, this document describes pretty well what's going on. The rest are implementation details and can be found in the source code.
Ymacs is a DynarchLIB widget. As such, it can be included in any widget container. The demo embeds it in a DlLayout, which also displays a sample menu bar. The layout in turn is embedded in a DlDialog which shows up in maximized state. As you can see from the demo, it's quite easy to add a menu, or a toolbar; I don't think the basic Ymacs widget should provide these.
DEFINE_CLASS
Ymacs is written in what I began calling “DynarchLIB new style”. I wrote a PDF that explains what's going on with the DEFINE_CLASS statements. You should read it if you plan on hacking Ymacs, but in short here's what happens: as you might know, JavaScript doesn't provide very nice syntax for defining objects. You have to do all that weird stuff with the prototype and have to instantiate an object of the base class type in order to set up inheritance. Kind of ugly.
The DEFINE_CLASS “macro” provides a more convenient syntax and it's deeply integrated into DynarchLIB. It receives 3 arguments: the name of the new class (a string), a base class constructor (or null if there's no base class) and a function. This function is called immediately with 3 arguments, which I'm calling D, P and DOM. These args are convenient shortcuts to objects that are frequently used during the definition of a class. D is a reference to the new class constructor, which already exists when the function is called. P is a reference to the new class's prototype. And DOM is an alias for DynarchDomUtils, to save some typing.
The function that you supply to DEFINE_CLASS should construct the class' methods and properties. Create a method by inserting it into P, or a “static” method by inserting it into D. There are some “magic” keywords that you can add to these, such as D.CONSTRUCT—the constructor, or D.DEFAULT_ARGS—a specification of the arguments that the new object can receive in the constructor, along with their default values. Well, for more information you should really read my article (the implementation in DynarchLIB is a bit different, but not in essential ways).
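To make the calling convention concrete, here is a minimal sketch. The DEFINE_CLASS below is a simplified stand-in written for this example, not the DynarchLIB implementation; it only mirrors what is described above (three arguments, a definition function called with D, P and DOM, and D.CONSTRUCT). The Animal and Dog classes and the empty DOM object are assumptions made up for illustration.

```javascript
// Simplified stand-in for DEFINE_CLASS; NOT the DynarchLIB implementation.
function DEFINE_CLASS(name, base, definition) {
    // D exists before the definition function runs, as described in the text.
    function D() {
        if (base) base.apply(this, arguments);              // run base constructor
        if (D.CONSTRUCT) D.CONSTRUCT.apply(this, arguments); // then our own
    }
    Object.defineProperty(D, "name", { value: name });
    if (base) {
        // Set up inheritance without instantiating the base class.
        D.prototype = Object.create(base.prototype);
        D.prototype.constructor = D;
    }
    const DOM = {}; // placeholder standing in for DynarchDomUtils (assumption)
    definition(D, D.prototype, DOM);
    return D;
}

// Hypothetical classes, for illustration only.
const Animal = DEFINE_CLASS("Animal", null, function (D, P, DOM) {
    D.CONSTRUCT = function (name) { this.name = name; };          // constructor
    P.describe = function () { return this.name + " is an animal"; }; // method
    D.create = function (name) { return new D(name); };           // "static" method
});

const Dog = DEFINE_CLASS("Dog", Animal, function (D, P) {
    P.describe = function () { return this.name + " is a dog"; }; // override
});

console.log(new Dog("Rex").describe());
console.log(Animal.create("Tom").describe());
```

Methods go into P, “static” methods into D, and the base class is never instantiated just to wire up the prototype chain.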
Objects
Ymacs is centered around the following essential objects:
Ymacs_Buffer — that's where everything started; initially all code was written in a single object.
Currently a buffer inherits from DlEventProxy and it represents, well, a buffer. The text is split into lines which are saved in an array. The object provides sufficient operations to access and modify the text, and handles key press events forwarded by the active frame. A buffer also maintains a set of markers (see below), two of which are essential: the current cursor position, and the “mark” position. In Emacs, the “region” is defined by these two markers. You set the mark to one place, then move the cursor to define the region. You can swap the cursor and the mark using C-x C-x.
Ymacs_Marker — a marker (another idea stolen from Emacs) is a simple object that holds an integer: a position within a buffer. It doesn't inherit from anything, currently, though in the beginning it was a DlEventProxy.
Each marker is attached to a buffer. When the buffer text is modified, the attached markers receive an event which tells them the position where the buffer is modified, and an offset—the amount of characters that were inserted or removed (newlines are counted!). If the modified position is smaller than or equal to the position that the marker had saved, then the marker's position is updated by the given offset. This way you can maintain “stable positions” through a text, even while the text is modified.
The “changed” event that affects a marker also sends a “minimum allowed position”, which tells the marker the lowest location it may be set to in case position + offset would fall below it. This is useful, for example, when you store a marker at position 5 and remove the text from position 2 through to 10. The marker would receive position: 2, offset: -8, which would make it calculate the new location as 5 - 8 = -3, a negative value (not allowed). The minimum, in this case, would be 2, so the marker won't fall below this position, which intuitively seems to be what we want.
When a marker's position has changed, an onChange event is triggered which the editor can catch back. This is used, for example, to draw the caret at the right location.
When a marker is no longer needed, it should be destroy()-ed, so that the buffer won't keep updating it pointlessly for each text operation.
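The update rule described above can be sketched in a few lines. The class and method names here (Marker, textChanged) are hypothetical and chosen for this example; they are not the Ymacs_Marker API.

```javascript
// Hypothetical marker, illustrating the update rule only.
class Marker {
    constructor(position, onChange) {
        this.position = position;
        this.onChange = onChange || function () {};
    }
    // Called by the buffer after each edit.
    //   pos:    where the text was modified
    //   offset: number of characters inserted (> 0) or removed (< 0)
    //   min:    lowest position the marker may fall back to
    textChanged(pos, offset, min) {
        if (pos > this.position) return; // edit happened after the marker
        const old = this.position;
        this.position = Math.max(this.position + offset, min);
        if (this.position !== old) this.onChange(this.position, old);
    }
}

// The deletion example from the text: marker at 5, remove positions 2 to 10.
const marker = new Marker(5);
marker.textChanged(2, -8, 2);
console.log(marker.position); // 2
```

An edit after the marker leaves it alone, an insertion before it pushes it forward, and a deletion that swallows it clamps it to the minimum.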
Markers are useful for a lot of operations. For example, imagine text filling, which breaks paragraphs into lines. Text filling is implemented in a “Logo-style programming” manner, which turns out to be very convenient. It moves the caret to the start of the paragraph and first removes any existing newlines, so that the paragraph becomes one big line of text. Then, again starting at the beginning of the paragraph, it moves to the next whitespace character and checks the column. If it's bigger than “fill_column”, it moves back one word and inserts a newline. Then it continues until the end of the paragraph.
All this is done in a “save_excursion” closure. The “save_excursion” command creates a marker at the current cursor position (called the point), then executes the given function. After the function finishes execution, the point is restored to the position of the saved marker. Remember, since the marker is updated as the text is inserted or removed, at the end of the execution the cursor will be at the same “stable” location as it was previously, so if you press M-q to fill a paragraph and your cursor is in front of some word, you can be sure that it will be in front of the same word after the text is rearranged.
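A rough sketch of the save_excursion idea follows, using a toy buffer that keeps its text in a single string. TinyBuffer and its methods are hypothetical and exist only for this example; Ymacs_Buffer stores lines in an array and has a different interface.

```javascript
// Toy single-string buffer; hypothetical, not Ymacs_Buffer.
class TinyBuffer {
    constructor(text) {
        this.text = text;
        this.point = 0;     // cursor position
        this.markers = [];  // objects of the form { position }
    }
    remove(pos, count) {
        this.text = this.text.slice(0, pos) + this.text.slice(pos + count);
        this.shiftMarkers(pos, -count);
    }
    insert(pos, str) {
        this.text = this.text.slice(0, pos) + str + this.text.slice(pos);
        this.shiftMarkers(pos, str.length);
    }
    shiftMarkers(pos, offset) {
        for (const m of this.markers) {
            // Same rule as for markers in general: clamp to the edit position.
            if (pos <= m.position) m.position = Math.max(m.position + offset, pos);
        }
    }
    saveExcursion(fn) {
        const saved = { position: this.point };
        this.markers.push(saved);
        try {
            return fn();
        } finally {
            this.point = saved.position; // restore the point
            this.markers.splice(this.markers.indexOf(saved), 1); // drop the marker
        }
    }
}

const buf = new TinyBuffer("hello   big world");
buf.point = buf.text.indexOf("world"); // cursor in front of "world"
buf.saveExcursion(function () {
    buf.point = 0;    // wander off to the start
    buf.remove(5, 2); // collapse the extra spaces after "hello"
});
console.log(buf.text);                  // "hello big world"
console.log(buf.text.slice(buf.point)); // "world"
```

The cursor ends up in front of the same word even though its numeric position changed from 12 to 10 while the function ran.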
Turns out, Logo-style programming is very convenient for a lot of text editing algorithms, even if a bit slow. Take a look in ymacs-commands.js and search for “fill_paragraph” to see the implementation of text filling (it's a bit more complicated than I described because I wanted it to understand certain paragraph prefixes).
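For orientation, the filling steps described earlier can be sketched on a plain string. This is not the code from ymacs-commands.js: it ignores paragraph prefixes, works on a string instead of buffer commands, and replaces the whitespace with a newline rather than inserting one. The name fillParagraph and its parameters are assumptions for this example.

```javascript
// Simplified fill: walk from whitespace to whitespace, checking the column.
function fillParagraph(text, fillColumn) {
    // Step 1: remove existing newlines so the paragraph is one long line.
    let s = text.replace(/\s*\n\s*/g, " ");
    let lineStart = 0;  // index where the current line begins
    let lastSpace = -1; // last whitespace seen on the current line
    for (let pos = 0; pos <= s.length; pos++) { // pos plays the role of the caret
        const atEnd = pos === s.length;
        if (!atEnd && s[pos] !== " ") continue;
        // The caret is at a word boundary: check the column.
        if (pos - lineStart > fillColumn && lastSpace >= 0) {
            // "Move back one word" and break the line there.
            s = s.slice(0, lastSpace) + "\n" + s.slice(lastSpace + 1);
            lineStart = lastSpace + 1;
        }
        if (!atEnd) lastSpace = pos;
    }
    return s;
}

console.log(fillParagraph("aaa bbb\nccc ddd", 7));
// aaa bbb
// ccc ddd
```

A word longer than the fill column is left on a line of its own, since there is no earlier whitespace to break at.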
Ymacs_Frame — a widget that inherits from DlContainer and is responsible for drawing a buffer and the cursor, and for intercepting keyboard commands and forwarding them to the buffer. When a buffer is attached to a frame, the frame immediately displays the contents, moves the cursor to position zero, and attaches some event listeners to the buffer object, so that it can update the screen when the buffer is modified. The previous buffer, if any, is completely forgotten: its event listeners are removed, and so on.
Ymacs_Keymap — a keymap is an object that holds a set of key strings and their associated functions. It provides some helper functions for defining new keys, or “parsing” key events to transform them into key definition strings. In essence you shouldn't have to deal with it much, because it makes defining new keymaps so easy that you don't need to know the internals.
Take a look at Ymacs_Keymap_Emacs to see how the default keymap is defined.
Key definition strings are similar to those used in standard Emacs: some letters combined with the dash sign specify the modifiers, and the last character is the key itself. Some control keys must be used by long name. I'm pasting them below from DlKeyboard:
    BACKSPACE : 8, TAB : 9, ENTER : 13, ESCAPE : 27, SPACE : 32, DASH : 45,
    PAGE_UP : 33, PAGE_DOWN : 34, END : 35, HOME : 36,
    ARROW_LEFT : 37, ARROW_UP : 38, ARROW_RIGHT : 39, ARROW_DOWN : 40,
    INSERT : 45, DELETE : 46,
    F1 : 112, F2 : 113, F3 : 114, F4 : 115, F5 : 116, F6 : 117,
    F7 : 118, F8 : 119, F9 : 120, F10 : 121, F11 : 122, F12 : 123,

The numbers are character codes, but you don't normally have to use them. So, for instance, if you want to assign some command to Control-Alt-5, you would use the string “C-M-5”, or “M-C-5” (the order of the modifiers is not important, since they are normalized in Ymacs_Keymap). Or, if you want Control-Shift-PageUp, you would use “S-C-PAGE_UP”.
“M” stands for “Meta” and it's historically used in Emacs since early days; in Ymacs it really stands for the ALT key.
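To illustrate why the order of the modifiers does not matter, here is one way such a normalization could look. The function normalizeKey and the canonical order C, M, S are assumptions for this sketch; Ymacs_Keymap may normalize differently.

```javascript
// Canonical modifier order used by this sketch (an assumption).
const MODIFIER_ORDER = ["C", "M", "S"]; // Control, Meta (ALT), Shift

// Hypothetical helper: rewrite a key definition string so that
// equivalent definitions become the same string.
function normalizeKey(keydef) {
    const parts = keydef.split("-");
    const key = parts.pop(); // the last element is the key itself
    for (const mod of parts) {
        if (!MODIFIER_ORDER.includes(mod)) {
            throw new Error("Unknown modifier: " + mod);
        }
    }
    const ordered = MODIFIER_ORDER.filter(mod => parts.includes(mod));
    return ordered.concat(key).join("-");
}

console.log(normalizeKey("M-C-5"));       // "C-M-5"
console.log(normalizeKey("C-M-5"));       // "C-M-5"
console.log(normalizeKey("S-C-PAGE_UP")); // "C-S-PAGE_UP"
```

Because the dash separates modifiers from the key, the dash key itself has to be written by its long name, DASH, as in the list above.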
A buffer holds an array of keymaps. Initially there's only the default keymap, but more can be added for specific operations. For example isearch mode pushes its own keymap into this array.
When a keypress event is received, the buffer walks the keymaps array in reverse order and tries the event on each keymap. As soon as one keymap handles the event, the walk stops.
Keymaps are not yet tested on Mac OS X. Some things might not work out of the box because the Mac doesn't have an ALT key and I'm not sure yet which modifier I should check instead.
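The lookup order can be sketched as follows. Keymaps are modelled here as plain Map objects from key strings to handlers, and KeymapStack is a hypothetical helper; neither reflects the real Ymacs_Keymap or Ymacs_Buffer interfaces.

```javascript
// Hypothetical keymap stack: the most recently pushed keymap is tried first.
class KeymapStack {
    constructor(defaultKeymap) {
        this.keymaps = [defaultKeymap];
    }
    push(keymap) {
        this.keymaps.push(keymap);
    }
    remove(keymap) {
        const i = this.keymaps.lastIndexOf(keymap);
        if (i >= 0) this.keymaps.splice(i, 1);
    }
    // Walk the array in reverse; stop at the first keymap that handles the key.
    handle(key) {
        for (let i = this.keymaps.length - 1; i >= 0; i--) {
            const handler = this.keymaps[i].get(key);
            if (handler) {
                handler();
                return true;
            }
        }
        return false;
    }
}

const log = [];
const defaultKeymap = new Map([
    ["C-s", () => log.push("start isearch")],
    ["C-f", () => log.push("forward char")],
]);
const isearchKeymap = new Map([
    ["C-s", () => log.push("search next")],
]);

const stack = new KeymapStack(defaultKeymap);
stack.handle("C-s");         // handled by the default keymap
stack.push(isearchKeymap);
stack.handle("C-s");         // the isearch keymap shadows the default binding
stack.handle("C-f");         // not in isearch, falls through to the default
stack.remove(isearchKeymap); // leaving the mode removes its keymap
console.log(log); // [ 'start isearch', 'search next', 'forward char' ]
```

Walking in reverse lets a temporary mode override only the keys it cares about while everything else keeps its default meaning.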
Ymacs_Tokenizer — a tokenizer object is attached to a buffer and does some syntactic analysis of the text. If you look into ymacs-tokenizer.js, you'll see it actually starts with an Ymacs_Stream object, which provides a light, line-based way to traverse a buffer. This was added as an afterthought, when I realized that walking the buffer with Ymacs_Buffer commands is too slow for parsing.
A tokenizer starts looking at the text and emits “onTokenFound” events whenever it encounters a significant piece of text. This event receives the line number, the starting column and ending column of the token. The buffer listens to these events and uses text properties to highlight the token. Note that such a token cannot span across multiple lines.
The actual parsing is done by an object which is specialized on the particular programming language of the current mode. I don't even know what to call this object (let's call it a parser object though); there's no real class defined for it, we just use functions according to a method described by Marijn Haverbeke in his story about CodeMirror. In essence, this object provides two methods: next() and copy(). Although it's similar to CodeMirror's, my implementation starts to diverge here. next() doesn't return a token; in fact the return value is ignored. It should just call the “onTokenFound” event on the tokenizer object whenever it finds a token, and can throw two recognized exceptions to stop the parsing: Ymacs_Stream.EOL at end of line, and Ymacs_Stream.EOF at end of file. next() is free to discover more than one token, should it want to, but should be careful to throw the given exceptions when appropriate.
When an EOL exception is caught in the tokenizer, copy() is called on the parser object. It should return a function that, when called at any time in the future, restores the parser to its current state and returns a parser object. That can be the same parser object, or a different one; this doesn't really matter. What matters is that the returned object knows the saved state and is able to continue parsing from there. There's some amount of brain twisting to understand what's going on (no way if you don't understand closures, by the way, but my article explains them very well I hope; read it, now).
The most complicated part of the parser object is figuring out multi-line constructs. Since it has to throw EOL at the end of each line, it must interrupt work and be able to resume later. For example, if it was in a multi-line comment when EOL occurs, it has to know that it was parsing a comment when it is resumed. In the implementation of the two parsers that I wrote so far (JavaScript and XML) I'm using a $cont array that contains functions that should be called to continue parsing. When a multi-line construct is started, the function that parses it is pushed to $cont and then called. It should call $cont.pop() only when its construct is finished. next() will first look into the $cont array, and if any special continuation is found there, it will be called instead of doing normal parsing.
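The following is a minimal sketch of the $cont technique for one multi-line construct, a /* ... */ comment. Everything in it is an assumption made for illustration: the EOL and EOF objects stand in for Ymacs_Stream.EOL and Ymacs_Stream.EOF, the stream is a bare object with lines, line and col fields, and makeParser and tokenize are hypothetical names. It is not the JavaScript parser shipped with Ymacs.

```javascript
const EOL = { name: "EOL" }; // stand-in for Ymacs_Stream.EOL
const EOF = { name: "EOF" }; // stand-in for Ymacs_Stream.EOF

function makeParser(stream, onTokenFound) {
    const $cont = []; // functions that continue an unfinished construct

    // Parses a comment; `from` is the column where the token starts on this line.
    function readComment(from = stream.col) {
        const line = stream.lines[stream.line];
        const end = line.indexOf("*/", stream.col);
        if (end < 0) {
            // Not finished on this line: stay in $cont and interrupt.
            onTokenFound(stream.line, from, line.length, "comment");
            stream.col = line.length;
            throw EOL;
        }
        onTokenFound(stream.line, from, end + 2, "comment");
        stream.col = end + 2;
        $cont.pop(); // construct finished
    }

    function next() {
        if (stream.line >= stream.lines.length) throw EOF;
        const line = stream.lines[stream.line];
        if (stream.col >= line.length) throw EOL;
        // A pending continuation takes priority over normal parsing.
        if ($cont.length > 0) return $cont[$cont.length - 1]();
        if (line.startsWith("/*", stream.col)) {
            const start = stream.col;
            stream.col += 2; // skip the opener so "/*/" is not seen as closed
            $cont.push(readComment);
            return readComment(start);
        }
        stream.col++; // anything else: skip one character
    }

    // Returns a function that restores the state saved now and
    // returns a parser (here: the same object).
    function copy() {
        const saved = $cont.slice();
        return function resume() {
            $cont.splice(0, $cont.length, ...saved);
            return parser;
        };
    }

    const parser = { next, copy };
    return parser;
}

// Hypothetical driver playing the role of the tokenizer.
function tokenize(lines) {
    const tokens = [];
    const resumeAt = []; // resumeAt[n] restores the state at the end of line n
    const stream = { lines, line: 0, col: 0 };
    const parser = makeParser(stream, (line, c1, c2, type) =>
        tokens.push({ line, c1, c2, type }));
    for (;;) {
        try {
            parser.next();
        } catch (ex) {
            if (ex === EOF) break;
            if (ex !== EOL) throw ex;
            resumeAt[stream.line] = parser.copy();
            stream.line++;
            stream.col = 0;
        }
    }
    return { tokens, resumeAt };
}

const result = tokenize(["a /* one", "two */ b"]);
console.log(result.tokens);
```

The comment is reported as two tokens, one per line, which is consistent with the rule that a token cannot span multiple lines. The saved resume functions are what would let a tokenizer restart from the end of an arbitrary line after an edit instead of reparsing from the top.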
One other nasty aspect is that the text isn't completely redisplayed during syntax highlighting. For example if you type "function", it gets colored as a keyword. However if you then delete the "n", the tokenizer will start again but must emit onTokenFound events for the "functio" characters as well, otherwise they will remain colored as keywords. In this case, onTokenFound will receive a null token type, and Ymacs_Buffer will know that it has to remove the text properties for that area.
This is the playground. The objects above lie at the heart of Ymacs. There is one more object, the Ymacs widget itself, which is just a small piece of glue that binds these together. The Ymacs widget creates a minibuffer, which is really an Ymacs_Buffer and has a frame in its own right, and a mode line which displays some information about the currently active buffer. The mode line should be refactored: in essence, it should be removed from the Ymacs widget and added to Ymacs_Frame, since it's useful (and standard in Emacs) to have a mode line for each frame. Only the minibuffer frame, of course, should not display a mode line.
The Ymacs widget also maintains a “kill ring”, which is the basis of Ymacs' clipboard. The kill ring is shared across all buffers in an Ymacs widget, to make it easy to copy from one buffer and paste (“yank”) into another. Originally I tried to keep these objects self-contained and independent, so that we could have, for instance, a buffer and a frame without using the Ymacs widget; but I think this no longer works. Could be revived if there's enough interest for it.