Understanding Keyword Indexing and the HKWT (Helix Keyword Separator Table) Resource
About Keywords

In Helix, a keyword is any contiguous series of characters that do not include a Keyword Separator. Helix uses the term “keyword” and not simply “word” because any character can be set to act as a word character. Characters not defined as word characters are called keyword separators, as they determine the breaks between keywords.

About the Helix Keyword Separator Table

Helix maintains an internal table that specifies which characters are treated as word characters and which as keyword separators. This table, known as the Keyword Separator Table, is stored in an HKWT resource in the resource fork of the application and/or collection. The end user can modify this table, but because Helix provides no built-in interface to it, it is typically overlooked or left to technically savvy users.

Helix users may want to modify the Keyword Separator Table in order to change the behavior of Helix’s Mixed Case ◊ and word ◊ … tiles, and of keyword-based queries. For example, a designer who wishes to allow keyword searches on names such as “O’Malley” could modify the Keyword Separator Table to include the apostrophe as a word character.
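The effect of such a change can be illustrated with a minimal Python sketch. It is not Helix code: `split_keywords`, `default_rule`, and `apostrophe_rule` are hypothetical names, and the rule "letters and digits are word characters" is only an assumed stand-in for the real table.

```python
def split_keywords(text, is_word_char):
    """Split text into keywords: maximal runs of word characters."""
    keywords, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            # A separator ends the keyword collected so far.
            keywords.append("".join(current))
            current = []
    if current:
        keywords.append("".join(current))
    return keywords


def default_rule(ch):
    # Assumption: letters and digits stand in for the default word characters.
    return ch.isalnum()


def apostrophe_rule(ch):
    # Same rule, with the apostrophe promoted to a word character.
    return ch.isalnum() or ch == "'"


print(split_keywords("O'Malley", default_rule))     # ['O', 'Malley']
print(split_keywords("O'Malley", apostrophe_rule))  # ["O'Malley"]
```

With the apostrophe treated as a separator, the name is indexed as two keywords; once it is a word character, the whole name becomes a single keyword.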

Application-Based Rules Issue in Helix 6.0 & Prior

In Helix 6.0 and prior, the HKWT resource that defines the Keyword Separator Table is stored within the Helix application, not the collection. If a collection relies on a modified Keyword Separator Table, the user must remember to update the HKWT resource in every new version of Helix that is installed. Failure to maintain the HKWT resource leads to inconsistent results in Keyword-based searches.

Starting in Helix 6.1 (and RADE 6.1.1): Each collection can now optionally have its own HKWT resource, providing a customized Keyword Separator Table. Helix now checks for the HKWT resource inside the collection before loading the default HKWT from the application. Note, however, that the HKWT resource is not automatically added to new collections: it must be explicitly copied into the collection when collection-level specificity is desired. (See Keyword Management Utility below.)
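The lookup order described above can be sketched as follows. This is an illustration only: resource forks are modeled as plain dictionaries keyed by resource type and ID, and `select_keyword_table` is a hypothetical name, not a Helix function. The table contents are placeholder strings.

```python
HKWT_KEY = ("HKWT", 1)  # resource type and ID


def select_keyword_table(collection_resources, application_resources):
    """Prefer the collection's own HKWT; otherwise use the application default."""
    if HKWT_KEY in collection_resources:
        return collection_resources[HKWT_KEY]
    return application_resources[HKWT_KEY]


# Placeholder contents; the real resource holds the separator table itself.
application = {HKWT_KEY: "default table"}
plain_collection = {}                              # no HKWT copied in
custom_collection = {HKWT_KEY: "custom table"}     # HKWT explicitly added

print(select_keyword_table(plain_collection, application))   # default table
print(select_keyword_table(custom_collection, application))  # custom table
```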

Missing Characters Issue

In Helix 6.0 and earlier, many ‘High ASCII’ characters that are used frequently in European languages were not treated as word characters. Consequently, words containing characters such as Å and Ø were excluded from keyword searches.

These omissions — including the fi & fl ligatures — have been corrected in Helix 6.1.

Non-Breaking Space: Curiously, even the non-breaking space (NBSP) character was not previously considered a word character, even though the very definition of the non-breaking space argues for its inclusion. The non-breaking space (created by pressing option-space) is now treated as a word character.

Note: These changes first appeared in Helix Client/Server and Helix Engine 6.1, but not in Classic Helix RADE until release 6.1.1.

Note: The German Eszett (ß, ASCII character 0xA7) was not added to the Word Character list until Helix 6.1.3 (all products).
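Character codes such as 0xA7 refer to positions in the Mac Roman character set. As a convenience when reading or mapping a table, the codes of the characters mentioned above can be looked up with Python's built-in `mac_roman` codec. The helper name `mac_roman_code` is a hypothetical one chosen for this sketch.

```python
def mac_roman_code(ch):
    """Return the Mac Roman byte value of a single character."""
    return ch.encode("mac_roman")[0]


# Characters discussed in this section, written as Unicode.
samples = {
    "A with ring": "\u00c5",
    "O with stroke": "\u00d8",
    "Eszett": "\u00df",
    "fi ligature": "\ufb01",
    "fl ligature": "\ufb02",
    "non-breaking space": "\u00a0",
}

for name, ch in samples.items():
    print(f"{name}: 0x{mac_roman_code(ch):02X}")
```

All of these fall in the upper half of the table (0x80 and above), which is the range this technote calls ‘High ASCII’.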

Inaccurate Documentation Issue

It was also discovered that previously published information, such as that found in Appendix A of The Helix Reference, is inaccurate. For example, É is noted there as being a word character, but in the actual HKWT resource it was treated as a separator.

This technote corrects the documentation and represents the official published specification for the Helix Keyword Separator Table in Helix 6.1 and later.

Important Note on Keyword Index Changes

When the Keyword Separator Table is changed, Keyword Indexes in all affected collections must be rebuilt. Otherwise, pre-existing entries that were created while the old table was in effect remain in the index, and the index will be unreliable. Currently there is no code in Helix to detect this situation and automatically rebuild keyword indexes. Helix Utility includes a Break All Indexes command, but that also breaks regular field indexes, which are not affected by this change.

If the Keyword Index is not rebuilt, this problem can occur: when a keyword field is modified, the words in the field are added to the Keyword Index based on the new table. However, the old entries for that field, based on the old table, are not removed from the index. The keyword reindexing code searches for words to remove based on the current table, as it has no access to the table that was in effect when the data was previously entered. This can cause duplicate ‘hits’ when doing keyword searches.
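The failure mode can be reproduced with a toy model. The sketch below is not Helix's index implementation: `KeywordIndex`, `old_rule`, and `new_rule` are hypothetical, and the index is simply a dictionary from keyword to record IDs. It shows only the mechanism by which entries made under the old table are left behind.

```python
def split_keywords(text, is_word_char):
    """Split text into maximal runs of word characters."""
    keywords, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            keywords.append("".join(current))
            current = []
    if current:
        keywords.append("".join(current))
    return keywords


class KeywordIndex:
    """Toy index: keyword -> set of record IDs."""

    def __init__(self):
        self.entries = {}

    def add(self, record_id, text, is_word_char):
        for kw in split_keywords(text, is_word_char):
            self.entries.setdefault(kw, set()).add(record_id)

    def remove(self, record_id, text, is_word_char):
        # Only keywords produced by the table passed in can be found and removed.
        for kw in split_keywords(text, is_word_char):
            ids = self.entries.get(kw)
            if ids:
                ids.discard(record_id)
                if not ids:
                    del self.entries[kw]


def old_rule(ch):
    return ch.isalnum()                 # apostrophe is a separator


def new_rule(ch):
    return ch.isalnum() or ch == "'"    # apostrophe is a word character


index = KeywordIndex()
index.add(1, "O'Malley", old_rule)      # indexed under the old table

# The table changes, the index is not rebuilt, and the field is then edited.
index.remove(1, "O'Malley", new_rule)   # looks for "O'Malley": nothing to remove
index.add(1, "O'Brien", new_rule)

print(sorted(index.entries))            # ['Malley', 'O', "O'Brien"]
```

Record 1 no longer contains "Malley", yet a keyword search for it still finds the record, because the stale entries were never removed. Rebuilding the index under the new table clears them.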

In summary: when changing the Keyword Separator Table, be sure to rebuild all existing Keyword Indexes in any affected collections. Helix 6.2 (or later) users can easily rebuild just the Keyword Indexes in a collection with the Rebuild Keyword Indexes script available on the AppleScripts for Helix page.

Default Keyword Separator Tables, with Revisions for Helix 6.1

These are the official, accurate tables for various versions of Helix. Those found in older references should be discarded.

Note: The Keyword Separator Table changes in Helix 6.1 page highlights the changes made in Helix 6.1 by specifically noting the changed characters.

Keyword Management Utility

A simple utility has been developed to help manage HKWT resources. The Keyword Management Utility is an AppleScript that can:

  • Create a map of an existing HKWT
  • Copy an HKWT from one location to another
  • Modify an existing HKWT
  • Rebuild Keyword Indexes (Helix 6.2 or later only)

Click here to download the Keyword Management Utility.

Direct Editing

You can also edit the HKWT resource directly using a resource editor. See our Resource Editing page for information on resource editors.

Changes In macOS

Classic Helix allows you to enter any search string when specifying keyword-based restrictions with the Word Starts With and Word Equals operators in Form and Power Queries, even logically impossible strings that contain Separator characters. Such a query can never produce results. macOS Helix checks the search term against the Keyword Separator Table and reports an error when Separator characters are included.
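A check of this kind might look like the following sketch. It only illustrates the idea: `validate_search_term` and `default_rule` are hypothetical names, the letters-and-digits rule is an assumed stand-in for the real table, and the error text is not the message Helix displays.

```python
def validate_search_term(term, is_word_char):
    """Reject a keyword search term that contains separator characters."""
    separators = [ch for ch in term if not is_word_char(ch)]
    if separators:
        raise ValueError(
            f"search term {term!r} contains separator characters: {separators!r}"
        )
    return term


def default_rule(ch):
    # Assumption: letters and digits stand in for the default word characters.
    return ch.isalnum()


print(validate_search_term("Malley", default_rule))

try:
    validate_search_term("O'Malley", default_rule)
except ValueError as err:
    print("rejected:", err)
```

Since a keyword can never contain a separator, a term that fails this check could not match any index entry, so reporting the error up front is more helpful than silently returning nothing.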

Definitions

Terms Used In This Technote:

  • Keyword Field: A field with the Keyword checkbox turned on.
  • Keyword Index: An internal index built automatically when a field is designated as a Keyword Field. Unlike regular field indexes, keyword indexes are not visible in Design Mode. A keyword index enables Keyword-based searches.
  • Keyword-Based Searches: Searches that use the Word Starts With or Word Equals operator. These operators are available in Form and Power Queries and in abaci.
  • HKWT Resource: A Macintosh OS resource, stored in the resource fork of a Helix application or collection. The HKWT resource must have ID#1.
  • Keyword Separator Table: The contents of the HKWT resource. A table that indicates whether each ASCII character is to be treated as a Word Character or a Separator Character.
  • Word Character: An ASCII character that is included in keywords. The default set includes all numbers, letters (including their high-ASCII variations), and the non-breaking space (NBSP) character.
  • Separator Character: An ASCII character that is excluded from keywords. The default set excludes all punctuation, control characters, and non-letter, high-ASCII characters.
  • ASCII Character: The standard characters as defined in the Mac Roman Character Set.

The Helix Reference

Keywords are described in the following sections of The Helix Reference:

  • 5.1.1.3: Keyword/Edit Keyword
  • 7.1.6: Keyword Option
  • Appendix A (obsolete)