Hi @anton,
I have a new question. Stream Enrich is now successfully writing data to my enriched stream, so I think the next step is to set up the S3 Loader to get that data into an S3 bucket. I have created a bucket for that and filled in the configuration file:
# Default configuration for s3-loader
# Sources currently supported are:
# 'kinesis' for reading records from a Kinesis stream
# 'nsq' for reading records from a NSQ topic
source = "kinesis"
# Sink is used for sending events whose processing failed.
# Sinks currently supported are:
# 'kinesis' for writing records to a Kinesis stream
# 'nsq' for writing records to a NSQ topic
sink = "kinesis"
# The following are used to authenticate for the Amazon Kinesis sink.
# If both are set to 'default', the default provider chain is used
# (see http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html)
# If both are set to 'iam', use AWS IAM Roles to provision credentials.
# If both are set to 'env', use environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
aws {
accessKey = "xxx"
secretKey = "xxx/"
}
# Config for NSQ
# nsq {
# Channel name for NSQ source
# If more than one application is reading from the same NSQ topic at the same time,
# all of them must have unique channel name for getting all the data from the same topic
# channelName = "{{nsqSourceChannelName}}"
# Host name for NSQ tools
# host = "{{nsqHost}}"
# HTTP port for nsqd
# port = {{nsqdPort}}
# HTTP port for nsqlookupd
# lookupPort = {{nsqlookupdPort}}
# }
kinesis {
# LATEST: most recent data.
# TRIM_HORIZON: oldest available data.
# "AT_TIMESTAMP": Start from the record at or after the specified timestamp
# Note: This only affects the first run of this application on a stream.
initialPosition = TRIM_HORIZON
# Needs to be specified when initialPosition is "AT_TIMESTAMP".
# Timestamp format needs to be "yyyy-MM-ddTHH:mm:ssZ".
# Ex: "2017-05-17T10:00:00Z"
# Note: Time needs to be specified in UTC.
# initialTimestamp = "{{timestamp}}"
# Maximum number of records to read per GetRecords call
maxRecords = 1000
region = "us-east-1"
# "appName" is used for a DynamoDB table to maintain stream state.
appName = "s3loader-test"
}
streams {
# Input stream name
inStreamName = "Stream-Enriched-Good"
# Stream for events for which the storage process fails
outStreamName = "S3-Process-Fail"
# Events are accumulated in a buffer before being sent to S3.
# The buffer is emptied whenever:
# - the combined size of the stored records exceeds byteLimit or
# - the number of stored records exceeds recordLimit or
# - the time in milliseconds since it was last emptied exceeds timeLimit
buffer {
byteLimit = 1048576 # Not supported by NSQ; will be ignored
recordLimit = 100
timeLimit = 60000 # Not supported by NSQ; will be ignored
}
}
s3 {
region = "us-east-1"
bucket = "enrich-s3-loader"
# Format is one of lzo or gzip
# Note that gzip can only be used for the enriched data stream.
format = "gzip"
# Maximum Timeout that the application is allowed to fail for
maxTimeout = 120000
}
# Optional section for tracking endpoints
# monitoring {
# snowplow{
# collectorUri = "{{collectorUri}}"
# collectorPort = 80
# appId = "{{appName}}"
# method = "{{method}}"
# }
# }
I double-checked all the values, and the same access keys are able to connect to the streams and create tables in DynamoDB.
When I run it, I get this error:
configuration error: ConfigReaderFailures(KeyNotFound(nsq,Some(ConfigValueLocation(file:/home/ec2-user/loader2.conf,1)),Set()),List())
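The error names the `nsq` key, and the only `nsq` section in my file is the one I commented out because I am reading from Kinesis. If the loader expects that section to be present regardless of the source (an assumption on my part, I have not confirmed it), an uncommented version would look roughly like the sketch here. The keys come from the commented-out template; the values are placeholders I made up, not real settings:

```
# Hypothetical placeholder values; my setup does not use NSQ
nsq {
  # Channel name for NSQ source (placeholder)
  channelName = "unused-channel"
  # Host name for NSQ tools (placeholder)
  host = "127.0.0.1"
  # HTTP port for nsqd (placeholder)
  port = 4151
  # HTTP port for nsqlookupd (placeholder)
  lookupPort = 4161
}
```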
Do you have any advice?
Thank you!