Using a Global Counter to Process an Exact Number of Records

Kitewheel is designed to process a large number of transactions in parallel. This creates a problem when you need to process or decision an exact number of records, for example when you have a limited number of offers to give away: because the processing is massively parallel, it is hard to create a reliable stopping condition.

Public variables appear to offer a way to create a counter that can be checked against the desired total. However, in the current implementation of public variables a graph has to read (get) the value before setting it. In the time between the read and the set, all of the other currently running graphs can read and attempt to update the counter, so several graphs can pass the check before any stopping condition is met.

For this reason we need a device that provides a guaranteed locking mechanism, so that each value can only be allocated once. The only practical way to do this is to use a database table with an auto-increment field, which serves as the counter.

It is worth pointing out that any single point of coordination in a graph such as this will necessarily create a choke point and will slow down processing overall. That is traded off against the functionality of being able to process an exact number of records.

Process

  1. Create a tracking table with only an auto-increment field and a timestamp field

  2. Inside the graph, first insert a row into the tracking table (this allocates an immutable unique key that is not shared with any other graph) and retrieve the auto-increment value and the allocated timestamp

  3. Count the rows in the tracking table whose key is less than or equal to that auto-increment value (optionally restricted to a time period such as the last 12 or 24 hours)

  4. If the count is still within bounds then continue to process

  5. If the count is above the specified bound then exit the graph
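The steps above can be sketched as a single MySQL session. This is only an illustration under stated assumptions: it uses the tracking table defined under Table Design, the limit of 1000 and the user variables `@max_offers`, `@token` and `@issued` are hypothetical names chosen for the example, and the connection is assumed to run with autocommit enabled so that each insert is visible to other sessions straight away.

```sql
-- Hypothetical limit for illustration; not part of the original design.
SET @max_offers = 1000;

-- Step 2: claim a token. The insert allocates a unique auto-increment value.
INSERT INTO tracking (nonce) VALUES ('Y');

-- LAST_INSERT_ID() is maintained per connection, so it returns the value
-- allocated by this session's insert, not by another graph's insert.
SET @token = LAST_INSERT_ID();

-- Step 3: count the rows at or below this token (this row included).
SELECT COUNT(*) INTO @issued
FROM tracking
WHERE tid <= @token;

-- Steps 4 and 5: continue while the count is within the bound, otherwise exit.
SELECT @token  AS token,
       @issued AS issued,
       IF(@issued <= @max_offers, 'continue', 'exit') AS decision;
```

In the graph itself the final comparison would normally be a branch node rather than a query; it is written as a `SELECT` here only so that the whole sequence can be tried from a MySQL client.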

 

For performance reasons the queries on the table need to be as fast as possible. This means that the auto-increment and timestamp fields should be auto-generated, and there should be an index on the timestamp field so that the counts return quickly. Note that you can’t just subtract the auto-increment field values, as they may not be allocated consecutively (though they usually are). In MySQL an auto-increment column must be indexed; in the table below it is the primary key, so it needs no separate index.

Table Design

 

    create table tracking(
      tid integer primary key auto_increment,
      ts timestamp(3) default current_timestamp(3),
      nonce char(1)
    );
    create index tracking_ts_idx on tracking(ts);

 

The nonce field is required because Kitewheel has an issue executing an insert statement with an empty values() expression.

Graph Design

 

  1. isEligible is just a random number generator that chooses whether this individual is eligible for the offer or not

  2. getToken - inserts a record into the tracking table and retrieves the primary key / auto-increment field

    insert into tracking(nonce) values ('Y');
  3. getPrimaryKey gets the zeroth value from the primaryKeys array returned by the database insert (the keys come back as an array because a table can in general have more than one generated key, although MySQL permits only one auto-increment column per table)

    return primaryKey[0]
  4. getCount returns the number of offers that have already been made (not including this potential one) and can optionally use a time window to limit the count to recent records

  5. Get Records - branch if the threshold has been exceeded

  6. createResponse just formats the response to say how many of “n” have been consumed - this is where the allowed business logic would be called.
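The query behind getCount is not reproduced in this article. A minimal sketch of what it could look like in MySQL is given here, assuming the tracking table from Table Design. The variable `@token` is a hypothetical stand-in for the value passed on from getPrimaryKey, and the 24-hour window is just an example value.

```sql
-- @token stands in for the key returned by getPrimaryKey (assumed name,
-- example value).
SET @token = 1001;

-- Count the offers made before this one, within the last 24 hours.
-- The strict "<" excludes the row this graph has just inserted.
SELECT COUNT(*) AS offers_made
FROM tracking
WHERE tid < @token
  AND ts >= NOW(3) - INTERVAL 24 HOUR;
```

Because the count excludes the current row, the branch that follows would let the graph continue while `offers_made` is strictly less than the limit. This gives the same outcome as counting with `<=` and comparing with "less than or equal to the limit".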

Recommendations

The downside of this design is that the second database query (the count) slows down as the table grows, so it is recommended that the table is truncated on a regular basis. On a well-configured database the two database queries should take less than 100 milliseconds.
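As a sketch of that housekeeping, the statements below show two options in MySQL. They assume the tracking table from Table Design and a 24-hour counting window; the window length is an example value, not a recommendation from the original text.

```sql
-- Option 1: remove only the rows that have aged out of the counting window.
-- The filter on ts can use the tracking_ts_idx index.
DELETE FROM tracking
WHERE ts < NOW(3) - INTERVAL 24 HOUR;

-- Option 2: empty the table completely, for example between campaigns.
-- In MySQL, TRUNCATE TABLE also resets the AUTO_INCREMENT counter.
TRUNCATE TABLE tracking;
```

Option 1 leaves a windowed count unchanged, because it deletes only rows that the window already excludes. Option 2 restarts the count from zero, so it is only appropriate when no offer allocation is in progress.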

 

© 2022 CSG International, Inc.