Mouse Selection & Highlighting

Design & Implementation of a Win32 Text Editor

Mouse input has proven to be the most intricate and difficult part of Neatpad to write so far. It hasn't been helped by the fact that Neatpad now supports variable-width fonts, and I am still unsure whether this extra complexity is a good thing from a tutorial / learning point of view. However, if I had stuck with fixed-width fonts it would be quite a task for anyone to move from that limited capability to a fully variable-width display, so from that perspective I think I made the right decision.

Most of the complexity has been caused by the deliberate separation between the TextView and TextDocument classes. If the TextView had direct memory-access to the underlying file some of the code could be a little simpler. However, if we are to move to a 4GB file-editor we must hide the memory-management behind a TextDocument class and force a strict interface between GUI and file-management, accessing the file content in small chunks at a time. It makes our code cleaner, but a little more difficult to write. Anyway, without further ado let's look at what happens when a person clicks the mouse-pointer inside our Neatpad window.

Carets, Focus and Activation

In Windows, whenever the mouse is clicked inside a control, the expected behaviour is for the window to receive the input-focus and display some form of graphical feedback which indicates this change in focus. This is a consistent user-interface detail present throughout all of Windows.

The default behaviour in Windows is to send a WM_MOUSEACTIVATE message during this initial window activation. However, at no point does the target window actually receive the input focus due to this mouse-click - this is a detail not known or understood by many people. We must process this WM_MOUSEACTIVATE message manually, and set the focus ourselves:

case WM_MOUSEACTIVATE:
    return ptv->OnMouseActivate();

LONG TextView::OnMouseActivate()
{
    SetFocus(m_hWnd);
    return MA_ACTIVATE;
}

So now whenever the mouse is clicked inside our TextView, it will receive the input focus. More importantly though, the TextView will now receive an additional message - WM_SETFOCUS. It is during the processing of this message that we can create and show a caret.

case WM_SETFOCUS:
    return ptv->OnSetFocus();

LONG TextView::OnSetFocus()
{
    // fall back to a 2-pixel caret if the system setting is unavailable
    DWORD nWidth = 2;

    SystemParametersInfo(SPI_GETCARETWIDTH, 0, &nWidth, 0);
    CreateCaret(m_hWnd, (HBITMAP)NULL, nWidth, m_nLineHeight);

    ShowCaret(m_hWnd);
    RefreshWindow();
    return 0;
}

The text-caret will be just the same as in Notepad or Visual Studio. We specify a NULL bitmap handle to create a solid blinking caret. It will be exactly 2 pixels wide unless we are on Windows 2000 or later, in which case the SystemParametersInfo request for SPI_GETCARETWIDTH will succeed and we will use that setting instead. The caret will also be the same height as a line of text. This is a pretty standard way to show a text caret in Windows.

Whenever the TextView loses the input-focus, it will receive the WM_KILLFOCUS message and we can hide the text-caret and delete it:

case WM_KILLFOCUS:
    return ptv->OnKillFocus();

LONG TextView::OnKillFocus()
{
    HideCaret(m_hWnd);
    DestroyCaret();

    RefreshWindow();
    return 0;
}

Note that each time the TextView receives or loses focus, the entire window is redrawn. This will enable us to draw the text-selection in different colours depending on whether the window has focus or not. We now move onto the problem of placing the caret in the correct position whenever the mouse is clicked inside the window. Before we go this far however, we have a decision to make.

TextView Coordinates - offset or coordinate based?

A decision must be made before we go any further. We must decide how to keep track of the current cursor position and selection start/end positions. We have two choices, described below.

The first option is to use a single 32-bit file-offset to store all "coordinates". This is much like how a hex editor or the existing EDIT control works. The advantage of this method is that it is very simple to store, manage and "work with" file offsets - all data accesses are much simpler to perform because everything is offset/buffer based. The disadvantage is that a file-offset does not directly translate to the GUI - remember that we have to use a line buffer to access the underlying file. There is the potential here to introduce extra dependencies on the line-buffer management that may cause performance problems.

The second option is to use a "text-coordinate" system. This would require using two values - one for the line number, and one for the character-offset (or column number) within that line. The advantage of this technique is that it translates directly to what you see on the screen. Moving the caret around with the mouse and keyboard (i.e. user GUI actions) will be potentially a lot easier than the first method because we are using a more natural coordinate system for those actions. The disadvantages are two-fold. Firstly, data-access is now more difficult, and all data-accesses will be required to go through the line-buffer to find the true file-offset - again with the potential performance problems. The second disadvantage is the text-painting. It is far more cumbersome to perform comparisons with an "x,y" coordinate than it is a single value. For example, the decision as to whether to highlight a character based on an x,y coordinate is much more complicated than performing a simple integer comparison.

ULONG  m_nCursorOffset;
ULONG  m_nSelectionStart;
ULONG  m_nSelectionEnd;

I have experimented with method two in the past and to be honest it was just as complicated as the "pure" file-offset method - the complications and performance issues just get moved to a different place. For this reason the TextView will use the simpler "file-offset" method of storing text coordinates. This will hopefully isolate the difficult issues just inside the mouse-input handling and make the rest of our code easier to write.
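To make the painting argument concrete, here is a minimal sketch of the "is this character selected?" test under each scheme. The names TextPos, InSelectionByOffset and InSelectionByCoord are hypothetical and exist only for this illustration; they are not part of Neatpad.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>

// Hypothetical "text-coordinate": a line number plus a column within that line.
struct TextPos
{
    uint32_t line;
    uint32_t col;
};

// Offset-based test: normalise the range, then two integer comparisons.
inline bool InSelectionByOffset(uint32_t off, uint32_t selStart, uint32_t selEnd)
{
    uint32_t lo = std::min(selStart, selEnd);
    uint32_t hi = std::max(selStart, selEnd);
    return off >= lo && off < hi;
}

// Coordinate-based ordering: compare lines first, then columns.
inline bool Before(const TextPos &a, const TextPos &b)
{
    return a.line < b.line || (a.line == b.line && a.col < b.col);
}

// Coordinate-based test: the same idea, but every comparison is two-part.
inline bool InSelectionByCoord(TextPos p, TextPos selStart, TextPos selEnd)
{
    if (Before(selEnd, selStart))
        std::swap(selStart, selEnd);

    return !Before(p, selStart) && Before(p, selEnd);
}
```

Both functions accept a "backwards" selection (end before start), which is what you get when the mouse is dragged towards the top of the file.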

Placing the Text Caret

When the user clicks the mouse in a "normal" edit control, the text-caret is placed at the beginning of the nearest character to where the mouse is. It seems a kind of obvious thing to do, but this simple operation is going to be the most complex task we have tackled to date.

The first useful message we receive when the user clicks the mouse is WM_LBUTTONDOWN. The handler for this message is shown below and is actually quite simple. I have structured it in such a way that it doesn't matter if we are using offset-based or text-based coordinates.

LONG TextView::OnLButtonDown(UINT nFlags, int mx, int my)
{
    ULONG nLineNo;
    ULONG nCharOff;
    ULONG nFileOff;
    int   xpos;

    // map the mouse-coordinates to a real file-offset-coordinate
    MouseCoordToFilePos(mx, my, &nLineNo, &nCharOff, &nFileOff, &xpos);

    SetCaretPos(xpos, (nLineNo - m_nVScrollPos) * m_nLineHeight);

    // erase any existing selection
    InvalidateRange(m_nSelectionStart, m_nSelectionEnd);

    // reset cursor and selection offsets to the same location
    m_nCursorOffset   = nFileOff;
    m_nSelectionStart = nFileOff;
    m_nSelectionEnd   = nFileOff;

    // set capture for mouse-move selection 
    m_fMouseDown = true;

    SetCapture(m_hWnd);
    return 0;
}

There are two basic tasks that must be performed. The first is to identify which text-character within the file has been clicked - we need to retrieve the zero-based file-offset of this selected character (so we can keep track of the current cursor position). The second task is to position the text-caret next to the character we selected in step#1.

These complex operations have been isolated inside a single TextView member-function, MouseCoordToFilePos. The purpose of this function is to return the line number, character offset within that line (i.e. the column number), physical file offset, and finally the x-coordinate of the character as it appears on screen.

BOOL MouseCoordToFilePos (int    mx,            // mouse x-coordinate
                          int    my,            // mouse y-coordinate
                          ULONG *pnLineNo,      // [out] line number
                          ULONG *pnCharOffset,  // [out] column number 
                          ULONG *pnFileOffset,  // [out] file-offset
                          int   *px);           // [out] adjusted x coordinate

I'm not sure if I really want to include the full code of this function here because it is likely I will keep tweaking it to try and make it as clear/simple as possible. However I do want to describe the basic operation so people can understand exactly what is involved.

Find the line-number

The first thing to do is work out which line of text is under the mouse. This is actually very straight-forward because we are using fixed-height lines of text (i.e. every line of text is the same height).

ULONG lineno = (my / m_nLineHeight) + m_nVScrollPos;

If we were writing a word-processor or an HTML viewer the operation above would be a lot more complex, but for our simple text-editor it is sufficient to just divide the mouse Y-coordinate by the line-height. As you will see below, knowing what line we are currently looking at is very important because we must parse each specific line of text in order to calculate the cursor x-position.
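One detail the division glosses over is what happens at the edges. The sketch below is a guarded version of the same calculation. The clamping is my own assumption rather than something Neatpad is stated to do, and LineFromMouseY and its lineCount parameter are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map a client-area y coordinate to a line number,
// clamped to the lines that actually exist in the document.
uint32_t LineFromMouseY(int my, int lineHeight, uint32_t vScrollPos, uint32_t lineCount)
{
    assert(lineHeight > 0);

    if (lineCount == 0)
        return 0;

    // While the mouse is captured, y can go negative (pointer above the window).
    if (my < 0)
        my = 0;

    uint32_t lineno = static_cast<uint32_t>(my / lineHeight) + vScrollPos;

    // A click below the last line of text snaps to the last line.
    if (lineno >= lineCount)
        lineno = lineCount - 1;

    return lineno;
}
```

With 16-pixel lines and the view scrolled down by 5 lines, a click at y = 33 lands on line 7.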

The GetTextExtent problem

Once we know what line of text we are dealing with, we must work out which character within that line has been selected. Due to the possibility of tabs and control-characters occurring in the line of text (or if we are using variable-width fonts), we must parse the entire line of text (from the start) to work out which character falls under the mouse.

In Windows there are many APIs which tell you how big a string of text is (in pixels). However there are no APIs which perform the opposite conversion - i.e. how many characters fit within the specified space. Starting with Windows 2000 two new routines were introduced to address this problem - GetTextExtentPointI and GetTextExtentExPointI.

However we can't rely on just Windows 2000, and due to the GUI-file separation of the TextView/TextDocument (i.e. accessing the content in chunks), we must devise our own strategy to work out which character has been selected.

We start by accessing the line of text in fixed-sized blocks. As we proceed in this manner, the width of each block of text is calculated using a function NeatTextWidth (which is basically a wrapper around GetTextExtentPoint32, but also takes into account tabs and control-characters). The mouse x-coordinate is then checked against this block of text to see if it falls inside.

The picture above should hopefully illustrate this process fairly clearly. It's not at all accurate (and it isn't intended to be). All we want at this stage is a rough guess as to where the mouse has been placed. The code snippet below is basically what is happening here:

int curx    = 0;
int charoff = 0;

for(;;)
{
    // grab some text
    if((len = m_pTextDoc->getline(nLineNo, charoff, buf, TEXTBUFSIZE, &fileoff)) == 0)
        break;

    // find its width
    int width = NeatTextWidth(hdc, buf, len, -(curx % TABWIDTHPIXELS));

    // does the cursor fall within this segment?
    if(mx >= curx && mx < curx + width)
    {
        // narrow down the search (see the binary-chop below), then stop
        break;
    }

    // move onto the next block
    curx    += width;
    charoff += len;
}
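The last argument passed to NeatTextWidth in the loop above is a tab origin, presumably so that tab stops stay aligned to the start of the line rather than to the start of each block. The sketch below models that idea with a fixed character width and an absolute starting x-position. It is a simplified stand-in and not the real NeatTextWidth; TextWidthWithTabs and its parameters are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of tab-aware measuring: every ordinary character is
// charWidth pixels wide and a tab advances to the next multiple of tabWidth.
// startX is the x-position (relative to the start of the line) of the first
// character in 'text'.
int TextWidthWithTabs(const std::string &text, int startX, int charWidth, int tabWidth)
{
    assert(startX >= 0 && charWidth >= 0 && tabWidth > 0);

    int x = startX;

    for (char ch : text)
    {
        if (ch == '\t')
            x += tabWidth - (x % tabWidth);   // jump to the next tab stop
        else
            x += charWidth;
    }

    return x - startX;
}
```

The same tab character measures 8 pixels when it starts at x = 24 and 32 pixels when it starts at x = 32 (with 32-pixel tab stops), which is why a block cannot be measured correctly without knowing where on the line it begins.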

Once the correct block of text has been identified we must narrow down the search using what is essentially a "binary chop" algorithm.

Binary-Chop

We are now working at the single character level. For efficiency reasons I really don't want to call NeatTextWidth for each character in turn so the binary-chop (or binary-search) is perfect for this situation. The diagram below shows the algorithm in action.

We keep track of the search using two variables, low and high, which specify offsets into the character-buffer we are searching. These offsets start at the two extreme ends of the buffer and then move inwards, narrowing the search down each iteration.

For each iteration, we take the mid-point between low and high. We then compare the mouse coordinate to see which side of this mid-point the cursor falls. If it is to the left then we center in on this segment, and likewise if it is to the right of this midline. We will eventually get to the point where we have closed in on a single character (low is exactly one less than high) with the mouse somewhere in this small range of pixels.

int low   = 0;
int high  = len;
int lowx  = 0;
int highx = width;

while(low < high - 1)
{
    int newlen   = (high - low) / 2;

    width = NeatTextWidth(hdc, buf + low, newlen, -lowx-curx);

    if(mx - curx < width + lowx)
    {
        high  = low + newlen;
        highx = lowx + width;
    }
    else
    {
        low   = low + newlen;
        lowx  = lowx + width;
    }
}

In computer science terms this method needs only O(log n) iterations to find a character in a block of n characters. A binary search is a very efficient algorithm, and even for very long lines of text it should still be quite fast. For variable-width fonts there really isn't any other way to perform this type of thing. Obviously for a fixed-width font display, we could simply scan through the whole line in one go but I won't bother "over-optimizing" just yet because it will just clutter the code up.
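To see the two stages working together, here is a self-contained sketch that runs the block scan and the binary chop over a table of per-character pixel widths. SpanWidth is a stand-in for NeatTextWidth, and HitResult, HitTestLine and the blockSize parameter are hypothetical names. A real implementation would fetch each block through the TextDocument rather than index a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for NeatTextWidth: total pixel width of 'count' characters.
static int SpanWidth(const std::vector<int> &w, std::size_t first, std::size_t count)
{
    int total = 0;
    for (std::size_t i = first; i < first + count; ++i)
        total += w[i];
    return total;
}

struct HitResult
{
    std::size_t index;   // character under the mouse
    int lowx;            // x-coordinate of its left edge
    int highx;           // x-coordinate of its right edge
};

// Returns false when mx is not over any character on the line.
bool HitTestLine(const std::vector<int> &w, int mx, std::size_t blockSize, HitResult *out)
{
    assert(blockSize > 0 && out != nullptr);

    int curx = 0;

    // Stage 1: walk the line one block at a time.
    for (std::size_t off = 0; off < w.size(); off += blockSize)
    {
        std::size_t len   = std::min(blockSize, w.size() - off);
        int         width = SpanWidth(w, off, len);

        if (mx >= curx && mx < curx + width)
        {
            // Stage 2: binary chop inside the block.
            std::size_t low = 0, high = len;
            int lowx = 0, highx = width;

            while (low + 1 < high)
            {
                std::size_t half = (high - low) / 2;
                int         part = SpanWidth(w, off + low, half);

                if (mx - curx < lowx + part)
                {
                    high  = low + half;
                    highx = lowx + part;
                }
                else
                {
                    low  += half;
                    lowx += part;
                }
            }

            out->index = off + low;
            out->lowx  = curx + lowx;
            out->highx = curx + highx;
            return true;
        }

        curx += width;
    }

    return false;
}
```

For the widths {5, 10, 3, 8, 20, 4, 6} scanned in blocks of 3, a click at x = 26 falls in the second block and the chop settles on character 4, which spans pixels 26 to 46.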

Snap to middle of character

It is at this point that we know which character has been clicked/selected with the mouse, and we have the x-coordinates of this character's starting and ending positions, in the lowx and highx variables:

The final detail to be implemented is to determine which side of the character to place the text-cursor (caret). Sloppy text-editors simply "round-down" to the start of each character (i.e. they just position the cursor at the start of the character by choosing the lowx coordinate). However a more natural "feel" can be achieved by using the center of each character to decide which side to place the caret.

if(mousepos > highx - FontWidth/2)
    caret = highx;
else
    caret = lowx;

Notice that the "TAB" character shown above has the selection-line positioned on the right-hand-side rather than the middle. This is a deliberate detail because I want to emulate the way Visual Studio places the cursor when it is positioned over a TAB (or any control-character which is also wider than a single letter).
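A few worked numbers show why measuring half a font-width back from highx gives the TAB behaviour described above. SnapCaretX is a hypothetical wrapper around the same comparison, and the pixel values are invented for the example.

```cpp
#include <cassert>

// Hypothetical wrapper around the snap rule: the caret moves to the right-hand
// edge only when the mouse is within half a font-width of that edge.
int SnapCaretX(int mx, int lowx, int highx, int fontWidth)
{
    assert(lowx <= highx && fontWidth >= 0);

    return (mx > highx - fontWidth / 2) ? highx : lowx;
}
```

For an ordinary 8-pixel character spanning 10 to 18 the switch-over point is x = 14, the middle of the character. For a TAB spanning 10 to 42 with the same font-width the switch-over point is x = 38, so the caret stays on the left until the mouse is almost at the end of the tab.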

Selecting with the mouse

Now that we are able to position the text-caret under any character within the TextView, we are ready to move onto mouse-selection. Cast your memory back to what we do when we process WM_LBUTTONDOWN - the m_nCursorOffset, m_nSelectionStart and m_nSelectionEnd variables were all set to point to the same location.

To extend the selection as we drag the mouse we can handle the WM_MOUSEMOVE message. Again we retrieve the file-offset under the mouse using MouseCoordToFilePos. Now however, we can modify just the m_nSelectionEnd variable to "point" to this new offset, leaving m_nSelectionStart where it is. This has the effect of extending the selection. To have this reflected on the screen we must obviously redraw the display, and this is where the tricky part comes in.

LONG TextView::OnMouseMove(UINT nFlags, int x, int y)
{
    if(m_fMouseDown)
    {
        ULONG nLineNo, nCharOff, nFileOff;
        int px;

        MouseCoordToFilePos(x, y, &nLineNo, &nCharOff, &nFileOff, &px);

        // update the area that has changed
        if(m_nSelectionEnd != nFileOff)
        {
            InvalidateRange(m_nSelectionEnd, nFileOff);

            SetCaretPos(px, (nLineNo - m_nVScrollPos) * m_nLineHeight);

            m_nSelectionEnd = nFileOff;
            RefreshWindow();
        }
    }

    return 0;
}

The WM_MOUSEMOVE handler (above) is quite similar to the WM_LBUTTONDOWN handler. We first translate the mouse x,y coordinates to a file-offset. Assuming that this offset is different from the current selection-end offset, we can reposition the text-caret and redraw the area of text between the old selection-end point and the new cursor position.
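The important point is that only the text between the old and new selection-end offsets needs repainting, whichever direction the mouse moved in. A minimal sketch of that calculation follows. OffsetRange and DirtyRange are hypothetical names, not Neatpad functions.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical half-open range of file offsets: [lo, hi).
struct OffsetRange
{
    uint32_t lo;
    uint32_t hi;
};

// The span of text whose highlighting changes when the selection end moves.
// The selection start never moves during a drag, so it plays no part here.
inline OffsetRange DirtyRange(uint32_t oldEnd, uint32_t newEnd)
{
    return { std::min(oldEnd, newEnd), std::max(oldEnd, newEnd) };
}
```

Dragging forwards from offset 10 to 25 dirties 10 to 25. Dragging back from 25 to 4 dirties 4 to 25, even if the selection started at 10, because the text between 10 and 25 must be un-highlighted and the text between 4 and 10 newly highlighted.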

Invalidate a range of text

The key to a good selection/highlighting strategy is to only redraw the bare minimum at a time - we must only paint where there are changes and never anywhere else - simply to avoid flicker rather than for performance. The InvalidateRange member-function does exactly this.

LONG TextView::InvalidateRange(ULONG nStart, ULONG nFinish);

The two parameters (nStart and nFinish) specify the range as file-offsets. It is the job of InvalidateRange to convert these two parameters to screen coordinates and cause just the specified region to be redrawn. This is the exact "opposite" to MouseCoordToFilePos - we are moving from file offsets back to screen coordinates now. You can use whatever coordinate system you want (file-offsets or text-coordinates) - it really doesn't matter. It is the concept of limiting the redraw to the change in selection that is important here.

Note that the display isn't actually redrawn using this function - the specified area is instead invalidated (using a series of calls to InvalidateRect), and the task of doing the actual drawing is left up to the WM_PAINT handler which we have already implemented.

Even though our WM_PAINT handler redraws whole lines at a time, we have invalidated only a specific area, so the update region for the window will clip our output and prevent us from drawing over areas that didn't change.

The picture above is meant to illustrate a "selection in progress". The selection has been made in four steps, starting with the lightest blue segment. The basic idea is to break the task up into lines, calling InvalidateRect for each span of text. What I am trying to show is that a selection-change could be a small segment on one line, or a change involving multiple lines at the same time.
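The "break it up into lines" step can be sketched independently of any drawing code. The version below assumes a table of line-start offsets is available, which is my assumption for the example; LineSpan and RangeToLineSpans are hypothetical names. Each span it returns would still need its columns converted to pixel positions before being passed to InvalidateRect.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One per-line piece of an offset range, in columns: [firstCol, lastCol).
struct LineSpan
{
    uint32_t line;
    uint32_t firstCol;
    uint32_t lastCol;
};

// lineStart[i] is the file offset where line i begins (lineStart[0] == 0).
std::vector<LineSpan> RangeToLineSpans(const std::vector<uint32_t> &lineStart,
                                       uint32_t fileLength,
                                       uint32_t nStart, uint32_t nFinish)
{
    std::vector<LineSpan> spans;

    if (lineStart.empty())
        return spans;
    assert(lineStart.front() == 0);

    // Accept the range in either order and keep it inside the file.
    if (nStart > nFinish)
        std::swap(nStart, nFinish);
    nFinish = std::min(nFinish, fileLength);
    if (nStart >= nFinish)
        return spans;

    // The line containing nStart is the last one that begins at or before it.
    auto it = std::upper_bound(lineStart.begin(), lineStart.end(), nStart);
    std::size_t line = static_cast<std::size_t>(it - lineStart.begin()) - 1;

    for (; line < lineStart.size() && lineStart[line] < nFinish; ++line)
    {
        uint32_t begin = lineStart[line];
        uint32_t end   = (line + 1 < lineStart.size()) ? lineStart[line + 1] : fileLength;

        uint32_t from = std::max(nStart, begin);
        uint32_t to   = std::min(nFinish, end);

        spans.push_back({ static_cast<uint32_t>(line), from - begin, to - begin });
    }

    return spans;
}
```

With lines starting at offsets 0, 10, 25 and 30, the range 12 to 14 yields a single span on line 1, while 5 to 27 yields three spans: the tail of line 0, all of line 1 and the first two columns of line 2.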

The InvalidateRange function (however it is implemented) must be able to handle these different scenarios correctly. There is little point in including the function body here so it is time to finish this part of the tutorial.

Coming up in Part 6

Mouse selection is quite a tricky subject but I hope I have covered it adequately for people to appreciate what is required for such a task. Also remember that this is a simple text editor - imagine how much more difficult it would be to write a real word processor or web-browser which has to handle many different types of text and graphics.

Part 6 will take what we have implemented here and add "mouse scrolling" to the equation.