Elasticsearch 5.5 TTL configuration


#1

Hi All,

I recently migrated to Elasticsearch 5.5. My whole system is running again, but I can’t figure out how to enable TTL again since it has been removed in ES 5. Is there an alternative configuration that keeps documents only for X amount of time?

Thanks in advance!


#2

It looks like the recommendation from Elasticsearch is to either use time-based indices or externally schedule a process to remove documents based on timestamp.

"The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side. For _ttl, you should either use time-based indices when applicable, or cron a delete-by-query with a range query on a timestamp field."

via https://www.elastic.co/guide/en/elasticsearch/reference/5.6/breaking_50_mapping_changes.html#_literal__timestamp_literal_and_literal__ttl_literal
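For the cron option, here is a rough Python sketch of what the scheduled job could look like. It only uses the standard library and the _delete_by_query endpoint with a range query. The base URL, the index name, and the created_at field are placeholders I made up; swap in whatever regular date field your application populates.

```python
import json
import urllib.request


def build_expiry_query(timestamp_field, max_age):
    """Build a delete-by-query body matching documents older than max_age.

    max_age uses Elasticsearch date math units, e.g. "30d" or "12h".
    """
    return {"query": {"range": {timestamp_field: {"lt": "now-" + max_age}}}}


def delete_expired(base_url, index, timestamp_field, max_age):
    """POST the range query to <index>/_delete_by_query and return the reply."""
    body = json.dumps(build_expiry_query(timestamp_field, max_age)).encode("utf-8")
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_delete_by_query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical values: adjust the URL, index and field to your setup.
    result = delete_expired("http://localhost:9200", "events", "created_at", "30d")
    print(result.get("deleted"))
```

Run it from cron (or any scheduler) as often as you want expired documents cleaned up. Note that this still deletes document by document, so for large volumes the time-based index approach is likely cheaper.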


#4

Hi @marien,

Our recommendation is to use time-based indices. Internally we use daily indices, which allow us to easily control the amount of data in the cluster and also let us change shard counts over time if event volumes change.

The way to do this with the Elasticsearch Sink is to:

  1. Set up an alias for a daily index
  2. Use this alias in your sink configuration
  3. Update your alias each day after creating your new index
  • NOTE: You will need to ensure that your alias only points to 1 index for the sink to be able to work!
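The steps above could be scripted roughly like this in Python. Treat it as a sketch under assumptions rather than our actual tooling: the events prefix, the events-write alias name, the index naming pattern, and the 30-day retention are all made up for illustration, and it assumes yesterday's index currently holds the alias. The swap goes through the _aliases endpoint so the remove and add happen in one request and the alias never points at two indices.

```python
import datetime
import json
import urllib.request


def daily_index_name(prefix, day):
    """Name of the daily index for a given date, e.g. events-2017.09.05."""
    return f"{prefix}-{day:%Y.%m.%d}"


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases that moves the alias in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def send(base_url, method, path, body=None):
    """Send a JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url=f"{base_url}/{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def rotate(base_url, alias, prefix, today, keep_days):
    """Create today's index, move the alias to it, drop the expired index."""
    new_index = daily_index_name(prefix, today)
    old_index = daily_index_name(prefix, today - datetime.timedelta(days=1))
    expired = daily_index_name(prefix, today - datetime.timedelta(days=keep_days))
    send(base_url, "PUT", new_index)  # create the new daily index
    send(base_url, "POST", "_aliases", alias_swap_actions(alias, old_index, new_index))
    send(base_url, "DELETE", expired)  # fails if that index does not exist


if __name__ == "__main__":
    # Hypothetical values: adjust URL, alias, prefix and retention as needed.
    rotate("http://localhost:9200", "events-write", "events",
           datetime.date.today(), keep_days=30)
```

Dropping a whole expired index is what replaces the TTL here: instead of expiring individual documents, you delete the index for the day that has aged out.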

While the removal of TTL involves more management, it does make the cluster a lot more efficient at scale, as it is not constantly searching for data to expire.

Hope this helps!


#5

Thanks for the tips. They are really helpful. Looks like I have to learn something about aliases.