S3 plugin not functioning correctly for GZ files from Firehose #180

@apatnaik14

I was testing the S3 input plugin for a production POC in which a Firehose delivery stream writes CloudWatch logs into an S3 bucket, and I read them from that bucket into Logstash with the S3 plugin.

My logstash config is as below:

input {
  s3 {
    bucket => "test"
    region => "us-east-1"
    role_arn => "test"
    interval => 10
    additional_settings => {
      "force_path_style" => true
      "follow_redirects" => false
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    sniffing => false
    index => "s3-logs-%{+YYYY-MM-dd}"
  }
  stdout { codec => rubydebug }
}

When I start Logstash locally, I can see the data reaching Logstash, but it is not in a readable format, as shown below.

{
          "type" => "s3",
       "message" => "\u001F�\b\u0000\u0000\u0000\u0000\u0000\u0000\u0000͒�n\u00131\u0010�_��\u0015�����x���MC)\u0005D\u0016!**************************************",
      "@version" => "1",
    "@timestamp" => 2019-07-12T15:32:37.328Z
}
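The first bytes of the message above (\u001F, an undecodable byte, then \b, which is 0x08) are consistent with a gzip header (0x1F 0x8B, followed by 0x08 for deflate), which suggests the object body reached the pipeline still compressed. A minimal Python sketch for checking a downloaded object locally, outside Logstash, could look like this. The helper names `looks_like_gzip` and `read_lines` are hypothetical and are not part of any plugin.

```python
import gzip

# The two magic bytes that start every gzip member (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(data: bytes) -> bool:
    """Return True if the payload starts with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


def read_lines(data: bytes) -> list[str]:
    """Decompress the payload if it is gzip, then split it into text lines."""
    if looks_like_gzip(data):
        data = gzip.decompress(data)
    return data.decode("utf-8", errors="replace").splitlines()


if __name__ == "__main__":
    # Stand-in for an object body fetched from the bucket (hypothetical data).
    body = gzip.compress(b'{"msg": "first"}\n{"msg": "second"}\n')
    print(looks_like_gzip(body))
    for line in read_lines(body):
        print(line)
```

If the check returns True for an object taken from the bucket, the content is gzip data regardless of how the object key is named.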

I also tried adding codec => "gzip_lines" to the configuration, but then Logstash was not able to process those files at all. The documentation suggests that the S3 plugin supports GZ files out of the box. Could anyone point out what I am doing wrong?
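One possible cause, stated here as an assumption and not something this report confirms: if a reader decides whether to decompress based on the object key's extension, then objects whose keys lack a gzip suffix would be passed through as raw bytes. The sketch below illustrates that kind of check in Python. The function `would_decompress`, the suffix list, and the sample keys are all hypothetical and are not taken from the plugin source.

```python
# Assumed suffixes for illustration only; not read from any plugin's code.
GZIP_SUFFIXES = (".gz", ".gzip")


def would_decompress(key: str) -> bool:
    """Return True if an extension-based check would treat this key as gzip."""
    return key.lower().endswith(GZIP_SUFFIXES)


if __name__ == "__main__":
    # Hypothetical object keys: one with a gzip suffix, one without.
    keys = [
        "logs/app.log.gz",
        "2019/07/12/15/example-stream-1-2019-07-12-15-32-37-abc",
    ]
    for key in keys:
        print(key, would_decompress(key))
```

Listing the keys in the bucket and comparing them against a check like this would show whether the extension is a plausible explanation in this setup.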

Regards,
Arpan

Please find below version and OS information.

  • Version: Logstash 7.1.1 (Plugin logstash-input-s3-3.4.1)
  • Operating System: Ubuntu 17.04
  • Config File (if you have sensitive info, please remove it): Added above
  • Sample Data: N.A
  • Steps to Reproduce: Mentioned above.
