|R8323: A Query Restriction that Appears to Fail|
A list or subform may show records that violate the underlying query in Helix Client/Server.
During the testing of Helix Client/Server 6.2.1, it was observed that the ‘HelixChat’ window would occasionally expand to show a second page. The second page would remain for a time, but eventually the view would revert to the normal single page design.
Investigation determined that this is not a new phenomenon, and that it can be explained by understanding the construction of the view and the underlying principles at work in Helix Client/Server.
|How HelixChat is Built|
HelixChat is a single view containing a subform that displays the four (4) most recent records entered. The ‘wrapper’ is an entry view in an inert relation, with messages posted to another relation for permanent storage. As new messages (i.e: records) are posted, they are assigned a sequential number using the traditional max + 1 method.
The ‘magic’ of the HelixChat window is in the query that restricts the subform to showing just the most recent four records. This query is, essentially [ F'RecordNumber' ≥ [ [ Max F'RecordNumber' ] - CN'3' ] ] (using AEL nomenclature).
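The record numbering and the restricting query described above can be sketched in a few lines of Python. This is purely illustrative (Helix expresses the query in AEL, not code), and the function names are invented for clarity.

```python
def next_record_number(record_numbers):
    """Traditional 'max + 1' sequential numbering for a new record."""
    return (max(record_numbers) if record_numbers else 0) + 1

def visible_records(record_numbers):
    """Records satisfying RecordNumber >= Max(RecordNumber) - 3."""
    if not record_numbers:
        return []
    cutoff = max(record_numbers) - 3
    return [n for n in record_numbers if n >= cutoff]

messages = list(range(1, 106))    # records 1 through 105 exist
print(visible_records(messages))  # [102, 103, 104, 105]

messages.append(next_record_number(messages))  # record 106 is posted
print(visible_records(messages))               # [103, 104, 105, 106]
```

Note that because the cutoff is derived from the maximum record number, every new record shifts the query itself, which is the crux of the behavior explained below.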
This query should ensure that exactly four records satisfy the query, and therefore that only the last four records are displayed in the subform. However as our testing demonstrated, this query does not always produce the expected result.
This technote provides an explanation for this discrepancy, and why it should not be considered a bug, per se.
The summary explanation is that the record number assigned to each new record added to the list is also used in the query that restricts the list. Re-evaluating the query and redrawing the subform in real time could therefore create a situation where the subform is perpetually updating and never finishes drawing, because new data keeps arriving during the redraw process.
The model that Helix Server uses is that if a new record is added during the short interim while a search is being conducted, the search is completed and the results — including records that satisfied the query both before and after the record was added — are sent to the Client.
The alternative would be to throw the search results away and start a new list from scratch each time a piece of data changed. This could result in serious delays in data updating when there is heavy data entry. It would be theoretically possible for a Client to be perpetually waiting for a query to reach completion, and never see data. Of course, in reality this would require a poorly optimized query — one that takes seconds to evaluate instead of milliseconds — and a very intensive data entry session, but because it could happen, we choose the route that ensures that the Client does not appear to be perpetually hung waiting for data.
|Why the Extra Page Appears: a Technical Explanation|
Special thanks go to Steve Keyser, the original implementor of Client/Server Helix and our current implementation director, for much of the technical information that follows.
The problem has to do with the multi-threaded nature of Helix Client/Server. Consider this sequence of events:
In the following explanation, assume that record 105 is the current maximum record in the relation, and that record 106 is the one being added. Our query is for RecordID ≥ [ Max RecordID ] - 3, which would initially match RecordIDs 102 through 105, inclusive.
To understand what happens, it is best to look at things from the Server’s point of view, and for that we need to establish a little background knowledge… The Server does some query optimization before it starts attempting to narrow down the number of records it needs to look at, but the optimization is just a best guess. The Server re-evaluates the query on each record that it looks at to decide whether to include it in the list it is building. Data entry during this time interrupts the search in progress and is processed as a distinct operation, after which the search resumes where it left off.
In the multi-tasking Server’s world, it handles requests in a round-robin fashion, spending a little time on each. But any delay on a task (slow network, disk access, etc.) causes the Server to immediately switch tasks and look for another pending request. Thus even a query that only looks at a few records may be interrupted with something else to do while waiting for data to be read from the collection.
Finally, the Server cannot predict whether any particular search is going to execute quickly or take a very long time. A new task (on a new thread) is started for each search so the Server doesn’t become unresponsive during a lengthy (i.e: unoptimized) search.
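The round-robin scheduling described above can be illustrated with a toy cooperative scheduler. This is an assumption-laden sketch, not Helix source: the search task yields control after each record it examines, which is where the Server could service other requests, including data entry.

```python
from collections import deque

def search_task(relation, predicate, results):
    """A search that examines one record per time slice."""
    for record in relation:
        if predicate(record):
            results.append(record)
        yield  # yield point: the scheduler may run other tasks here

def run_round_robin(tasks):
    """Give each task a slice of time in turn until all are finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield point
            queue.append(task)  # not finished: back of the queue
        except StopIteration:
            pass                # task complete

hits = []
run_round_robin([search_task(range(1, 11), lambda r: r % 2 == 0, hits)])
print(hits)  # [2, 4, 6, 8, 10]
```

With several tasks in the queue, a long search is interleaved with whatever else is pending, which is why data entry can land in the middle of a search.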
On a busy network “something else” can, and often does, happen between the time the search request is received and the search actually starts. If that something else is data entry in the range of records — or affects the range of records — that the search is going to look at, how it is handled depends on where the Server was in that range of records when the data change happens.
Case 1: When the search request arrives at the Server, the Max record is 105, but the new record (106) arrives before the Server starts the search. When the Server starts the search, it considers the current state of the data and searches based on a Max record value of 106. It finds records 103, 104, 105, & 106, and returns the four expected records. All is well.
Case 2: When the search request arrives at the Server, the Max record is 105, the search begins, and the list of records that satisfy the query is started. But before the search is completed, the new record (106) arrives. The Server is already conducting a search for RecordID ≥ 102, and has already added some of those records (e.g. 102 & 103) to the results list. After the entry is completed, the search resumes; it finds and adds records 104, 105, and the new record 106, all of which satisfy the revised query! The results list now holds five records (102 through 106), one more than the query was meant to allow. The critical fact here is this: if the Server has already added records to the query results list, it will not go back and re-evaluate those records in order to remove ones that no longer satisfy the revised query.
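Case 2 can be demonstrated with a toy simulation (assumed behavior, not Helix source): the predicate is re-evaluated against the live data on each record, the data entry interrupts the search partway through, and already-accepted records are never revisited.

```python
def run_search(relation, interrupt_after, new_record):
    """Walk the relation record by record, re-evaluating the query cutoff
    against the *current* data; data entry interrupts mid-search."""
    results = []
    i = 0
    while i < len(relation):
        rec = relation[i]
        cutoff = max(relation) - 3  # query re-evaluated on live data
        if rec >= cutoff:
            results.append(rec)     # accepted records are never revisited
        i += 1
        if i == interrupt_after:    # data entry interrupts the search here
            relation.append(new_record)
    return results

# Records 1..105 exist; record 106 arrives after 102 and 103 are accepted.
hits = run_search(list(range(1, 106)), interrupt_after=103, new_record=106)
print(hits)  # five records: [102, 103, 104, 105, 106]
```

Records 102 and 103 were accepted while the cutoff was still 102; once record 106 arrives the cutoff becomes 103, but the list is not repaired, so five records reach the Client instead of four.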
Instead, the Server relies on the “Change-Record” system (see below) to repair the list. This works well for records added, deleted, and replaced in a list — unless those records actually change the query! When that happens, it is the Client’s responsibility to request that the Server recompute the list.
A “Change-Record” is a message sent from Server to every Client, informing each one that a record has been added, deleted, or modified. The Client checks its open windows to see if any of the data presented relies on that change. If it finds that a change might impact a particular view, that view is ‘invalidated’ (the checkerboard pattern obscures the data) and the Client requests up-to-date data from the Server.
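The Change-Record flow just described can be sketched as follows. The message fields, class names, and the invalidation flag are all invented for illustration; the actual wire format and Client internals are not documented here.

```python
from dataclasses import dataclass

@dataclass
class ChangeRecord:
    relation: str
    record_id: int
    kind: str  # "added", "deleted", or "modified"

class ClientView:
    """A stand-in for an open window on the Client."""
    def __init__(self, relation, depends_on_change):
        self.relation = relation
        self.depends_on_change = depends_on_change
        self.invalidated = False  # True -> checkerboard obscures the data

    def on_change_record(self, msg):
        # If the change might impact this view, invalidate it; the real
        # Client would then request up-to-date data from the Server.
        if msg.relation == self.relation and self.depends_on_change(msg):
            self.invalidated = True

# A view whose query depends on the max record number: any added record
# in its relation forces a recompute.
chat = ClientView("Messages", depends_on_change=lambda m: m.kind == "added")
chat.on_change_record(ChangeRecord("Messages", 106, "added"))
print(chat.invalidated)  # True
```

The key point is that the decision to invalidate is made per view on the Client side, after the Server has already broadcast the change to everyone.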
So now let’s look at it from the standpoint of the Client. The Server is about to deliver a list whose query changed while it was being built. The Client first receives a “Change-Record” informing it that “Record 106 has been added to the relation.” The Client notes that this will change the query and therefore that the query needs to be recomputed. But the Client is already in the middle of receiving a new list from the Server, and it cannot know whether the list coming from the Server is “Case 1” or “Case 2” as described above.
If it is Case 1 — that the Server processed the new record before starting the search for the Client — then the resulting list is up-to-date, and exactly what the user expects to see.
If it is Case 2 — that the Server processed the new record after starting the search for the Client — then the resulting list is up-to-date but may also include some stale data, and is not what the user expects to see.
The problem is that the Client doesn’t know which case it is. For the Client to know, the Server would need to tell the Client that, when the search began the ‘Max Record’ was 105, and the Server would also have to defer any Change-Record greater than that until after the search completed.
Helix does not do this (throw the search away and start a new one) for the reason explained earlier: under heavy data entry, the Client could be perpetually waiting for a query to reach completion and never see data, so we choose the route that ensures the Client does not appear to be hung waiting for data.
Considering that this is an ‘edge case’ — a query construction that is quite rare — we are not planning to address it through additional Client/Server overhead.
This is a known issue in all versions of Helix Client/Server.