Restarting Enrichment

Hi ya,

Our enrichers didn’t process some data due to a silly mistake on my end. Fortunately, the data is still in the Kinesis streams. It seems the enrichers' start point can be set in two ways:

  1. The config.hocon lets you specify a starting position (LATEST, etc.)
  2. DynamoDB stores positions (I believe)

When I kill and restart the enrichment process, which of the above takes precedence? Do I need to delete the DynamoDB entry first for the config entry to take effect?

Bonus question: If I set initialPosition = AT_TIMESTAMP in the config, where in the config do I specify the starting timestamp?

Thanks much,
Patrick

Hi @pcb,

This is the enum from the Kinesis Client Library (KCL) that lists the possible values for the starting position:

/**
 * Used to specify the position in the stream where a new application should start from.
 * This is used during initial application bootstrap (when a checkpoint doesn't exist for a shard or its parents).
 */
public enum InitialPositionInStream {
    /**
     * Start after the most recent data record (fetch new data).
     */
    LATEST,

    /**
     * Start from the oldest available data record.
     */
    TRIM_HORIZON,

    /**
     * Start from the record at or after the specified server-side timestamp.
     */
    AT_TIMESTAMP
}

As the comment on the enum says, this value is only used when no checkpoint exists. Once checkpoints exist in DynamoDB, it is ignored and the stream is read from the last checkpointed record. In your case, you probably want to delete the DynamoDB table and use AT_TIMESTAMP. The beginning timestamp needs to be specified here.
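To make the precedence concrete, here is a minimal Java sketch of the rule. It is not the actual KCL code: the `resolveStart` helper, the in-memory map standing in for the DynamoDB checkpoint table, and the shard ID and sequence number are all made up for illustration. It also ignores the parent-shard check that the enum's comment mentions.

```java
import java.util.HashMap;
import java.util.Map;

public class StartPositionSketch {

    // Mirrors the values of the enum quoted above.
    enum InitialPositionInStream { LATEST, TRIM_HORIZON, AT_TIMESTAMP }

    /**
     * Hypothetical helper: decides where a consumer starts reading a shard.
     * A stored checkpoint always wins; the configured initial position is
     * only a fallback for shards that have no checkpoint.
     */
    static String resolveStart(Map<String, String> checkpoints,
                               String shardId,
                               InitialPositionInStream configured) {
        String checkpoint = checkpoints.get(shardId);
        if (checkpoint != null) {
            // Resume right after the last checkpointed sequence number.
            return "AFTER_SEQUENCE_NUMBER:" + checkpoint;
        }
        return configured.name();
    }

    public static void main(String[] args) {
        // Stand-in for the DynamoDB checkpoint table (assumed shape).
        Map<String, String> checkpoints = new HashMap<>();
        checkpoints.put("shardId-000000000000", "495");

        // Checkpoint present: the configured AT_TIMESTAMP is ignored.
        System.out.println(resolveStart(checkpoints, "shardId-000000000000",
                InitialPositionInStream.AT_TIMESTAMP));

        // Checkpoints removed (e.g. table deleted): the config value applies.
        checkpoints.clear();
        System.out.println(resolveStart(checkpoints, "shardId-000000000000",
                InitialPositionInStream.AT_TIMESTAMP));
    }
}
```

The first call resumes from the stored checkpoint even though the config asks for AT_TIMESTAMP; only after the checkpoints are gone does the configured position come into play, which is why deleting the table is needed for the replay.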

Awesome thanks Ben!

That link was super helpful. I must have been working off an old template, as mine doesn’t have all that info.

Thanks again,
Patrick