Filebeat Configuration Best Practices Tutorial - Coralogix

In this post, we'll cover some of the key use cases Filebeat supports and look at various Filebeat configuration examples.

Filebeat, an Elastic Beat based on Elastic's libbeat framework, is a lightweight shipper for forwarding and centralizing log data. Installed as an agent on your servers, Filebeat monitors the log files or locations you specify, collects log events, and forwards them to Elasticsearch for indexing or to Logstash for further processing.

Installing Filebeat

Filebeat installation instructions can be found on the Elastic website.

Here are the Coralogix Filebeat installation instructions.

Coralogix also has a Filebeat K8s option out of the box. This document describes how to set up the Coralogix integration with Kubernetes.

Filebeat Settings

To configure Filebeat, edit the configuration file. For rpm and deb installations, you will find the configuration file, filebeat.yml, under /etc/filebeat. There is also a full example configuration file at /etc/filebeat/filebeat.reference.yml that shows all non-deprecated options. The Filebeat configuration file uses YAML for its syntax, as it is easier to read and write than other common data formats such as XML or JSON. The syntax includes dictionaries (unordered collections of name/value pairs) and also supports lists, numbers, strings, and many other data types. All members of the same list or dictionary must be indented to the same level. Lists and dictionaries can also be represented in a shorthand form, somewhat similar to JSON, using {} for dictionaries and [] for lists. For more information, see the configuration file format documentation.
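As a quick illustration of that syntax, the two snippets below describe the same structure, first in block form and then in the JSON-like shorthand (a generic YAML sketch, not a complete Filebeat configuration):

# Block form: a dictionary containing a nested dictionary and a list
output:
  elasticsearch:
    hosts:
      - "localhost:9200"

# Shorthand form: the same structure written with {} and []
output: { elasticsearch: { hosts: ["localhost:9200"] } }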

The Filebeat configuration file mainly consists of the following sections. For more information, see how to configure Filebeat.

  1. The Modules settings section can help with collecting, parsing, and visualizing common log formats (optional).
  2. The Inputs section determines the input sources (required if you are not using a module configuration).
  3. The Processors section is used to configure the processing of the data exported by Filebeat (optional). You can set a processor globally at the top level of the configuration, or on a specific input so that the processor applies to the data collected for that input.
  4. The Output section determines the output destination of the processed data.

There are other sections you can include in your YAML, such as a Kibana endpoint, the internal queue, etc. You can see them and their different options in the configure Filebeat documentation. Each of the sections has different options, and there are multiple modules to choose from, multiple input types, different outputs to use, etc.
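To give a sense of how these sections fit together, here is a minimal sketch of a filebeat.yml that touches each of them; the hosts, paths, and queue size are placeholder values, not recommendations:

# A minimal filebeat.yml skeleton (placeholder values)
filebeat.inputs:
  - type: log
    paths:
      - "/var/log/*.log"

# Optional: tune the internal memory queue
queue.mem:
  events: 4096

# Optional: Kibana endpoint, used when loading the sample dashboards
setup.kibana:
  host: "localhost:5601"

processors:
  - add_host_metadata: ~

output.logstash:
  hosts: ["localhost:5044"]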

In this post, I'll cover the main sections you can use and focus on giving examples that have worked for us here at Coralogix.

Modules

Filebeat modules make it easy to collect, parse, and visualize common log formats. A module is made up of one or more filesets; each fileset contains the Filebeat input configuration, an Elasticsearch ingest node pipeline definition, field definitions, and sample Kibana dashboards (when available). See the Filebeat modules documentation for more information.
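For example, on rpm/deb installations the module definitions typically live under /etc/filebeat/modules.d/, and you enable one by removing its .disabled suffix (or with the filebeat modules enable command). A minimal sketch of the filebeat.yml section that loads them:

# Load module definitions from the modules.d directory
filebeat.config.modules:
  # Glob pattern for the module configuration files
  path: ${path.config}/modules.d/*.yml
  # Set to true to pick up changes to module configs without a restart
  reload.enabled: false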

Inputs

If you are not using modules, you need to configure Filebeat manually. This is done by specifying a list of inputs under the filebeat.inputs section of filebeat.yml to tell Filebeat where to find the input data and how to process it. There are different types of inputs you can use with Filebeat; you can learn more about the different options in the configure inputs doc. In the following example, I'm using the Log input type with some common options:

# ============================== Filebeat inputs ===============================

# List of inputs to fetch data.
filebeat.inputs:

#-------------------------------- Log input -----------------------------------
- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  # To fetch all ".log" files from a specific level of subdirectories,
  # /var/log/*/*.log can be used. For each file found under this path,
  # a harvester is started.
  # Make sure no file is defined twice, as this can lead to unexpected behavior.
  paths:
    - "/var/log/nginx/access*.log"
    #- c:\programdata\elasticsearch\logs\*

  # Configure the file encoding for reading files with international
  # characters, following the W3C recommendation for HTML5
  # (http://www.w3.org/TR/encoding).
  # Some sample encodings:
  #   plain, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk,
  #   hz-gb-2312, euc-kr, euc-jp, iso-2022-jp, shift-jis, ...
  encoding: plain

  # Include lines. A list of regular expressions to match. It exports the
  # lines that match any regular expression from the list. include_lines is
  # called before exclude_lines. By default, all lines are exported.
  include_lines: ['^ERR', '^WARN']

  # Exclude lines. A list of regular expressions to match. It drops the lines
  # that match any regular expression from the list. include_lines is called
  # before exclude_lines. By default, no lines are dropped.
  exclude_lines: ['^DBG']

  # Exclude files. A list of regular expressions to match. Filebeat drops the
  # files that match any regular expression from the list.
  # By default, no files are dropped.
  exclude_files: ['1.log']

  # Optional additional fields. These fields can be freely picked to add
  # additional information to the crawled log files for filtering.
  # These 4 fields in particular are required for the Coralogix integration
  # with Filebeat to work.
  fields:
    PRIVATE_KEY: "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    COMPANY_ID: XXXX
    APP_NAME: "nginx"
    SUB_SYSTEM: "nginx"
    #level: info
    #site: 1

  # Set to true to store the additional fields as top-level fields instead of
  # under the "fields" sub-dictionary. In case of name conflicts with fields
  # added by Filebeat itself, the custom fields overwrite the default fields.
  # In Coralogix we use it a bit differently: if you want to add custom
  # fields, you have to define the fields under root true. This will also add
  # all the Filebeat metadata.
  fields_under_root: true

  ### JSON configuration

  # Decode JSON options. Enable this if your logs are structured in JSON.
  # JSON key on which to apply the line filtering and multiline settings. This
  # key must be top level and its value must be a string, otherwise it is
  # ignored. If no text key is defined, the line filtering and multiline
  # features cannot be used.
  json.message_key: "message"

  # By default, the decoded JSON is placed under a "json" key in the output
  # document. If you enable this setting, the keys are copied to the top level
  # of the output document.
  json.keys_under_root: true

  # If keys_under_root and this setting are enabled, then the values from the
  # decoded JSON object overwrite the fields that Filebeat normally adds
  # (type, source, offset, etc.) in case of conflicts.
  json.overwrite_keys: true

  # If this setting is enabled, Filebeat adds an "error.message" and
  # "error.key: json" key in case of JSON unmarshaling errors, or when a text
  # key is defined in the configuration but cannot be used.
  json.add_error_key: false

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is
  # common for Java stack traces or C-line continuation.

  # The regexp pattern that has to be matched.
  multiline.pattern: '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

  # Defines if the pattern set under "pattern" should be negated or not.
  # Default is false.
  multiline.negate: true

  # Match can be set to "after" or "before". It is used to define if lines
  # should be appended to a pattern that was (not) matched before or after,
  # or as long as a pattern is not matched, based on negate.
  # Note: "after" is the equivalent of "previous" and "before" is the
  # equivalent of "next" in Logstash.
  multiline.match: after

  # The maximum number of lines that are combined into one event.
  # If there are more than max_lines, the additional lines are discarded.
  # Default is 500.
  multiline.max_lines: 500

  # After the defined timeout, a multiline event is sent even if no new
  # pattern was found to start a new event. Default is 5s.
  multiline.timeout: 5s

There are a few more options for this input type, as you can see in the full example configuration file. Of course, you must provide the paths to your own files, and if you send your logs to Coralogix, you must use the fields options. Other than that, the most common option is the multiline option, as it allows you to merge log messages that span multiple lines (e.g., Java stack traces) based on your definition. By default, Filebeat splits these log lines into separate events as soon as it finds a \n. You can include multiple "- type: log" sections in your inputs if you want different multiline patterns for different filesets, or if you want to send your logs to Coralogix with many different application/subsystem names, or even send them to multiple Coralogix teams, all in the same YAML configuration.
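As an illustration, a common multiline setup for Java stack traces treats every line that begins with whitespace as a continuation of the previous line. This is a sketch in the spirit of Example 3 below; the path and pattern are assumptions you would adapt to your own logs:

filebeat.inputs:
- type: log
  paths:
    - "/var/log/app/server.log"
  # Append lines that start with whitespace (e.g. "  at com.example...") to
  # the event started by the preceding non-indented line
  multiline:
    pattern: '^[[:space:]]'
    negate: false
    match: after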

Processors

You can use processors to process events before they are sent to the configured output. The libbeat library provides processors for reducing the number of exported fields, performing additional processing and decoding, and so on. Each processor receives an event, applies a defined action to the event, and returns the event. If you define a list of processors, they are executed in the order they are defined in the Filebeat configuration file. Below is an example with multiple processors configured. For more information, see filtering and enhancing your data.

# =================================== Processors ===================================

# Processors are used to reduce the number of fields in the exported event or
# to enhance the event with external metadata. This section defines a list of
# processors that are applied one by one, and the first one receives the
# initial event:
#
#   event -> filter1 -> event1 -> filter2 -> event2 ...
#
# Supported processors include drop_fields, drop_event, include_fields,
# decode_json_fields, and add_cloud_metadata.
#
# For example, you can use the following processors to keep the fields that
# contain CPU load percentages, but remove the fields that contain CPU ticks
# values:
#processors:
#  - include_fields:
#      fields: ["cpu"]
#  - drop_fields:
#      fields: ["cpu.user", "cpu.system"]
#
# The following example drops the events that have the HTTP response code 200:
#processors:
#  - drop_event:
#      when:
#        equals:
#          http.code: 200
#
# The following example renames the field a to b:
#processors:
#  - rename:
#      fields:
#        - from: "a"
#          to: "b"
#
# The following example enriches each event with the machine's local time zone
# offset from UTC:
#processors:
#  - add_locale:
#      format: offset
#
# The following example enriches each event with host metadata:
#processors:
#  - add_host_metadata: ~
#
# The following example decodes fields containing JSON strings and replaces
# the strings with valid JSON objects:
#processors:
#  - decode_json_fields:
#      fields: ["field1", "field2", ...]
#      process_array: false
#      max_depth: 1
#      target: ""
#      overwrite_keys: false
#
# The following example copies the value of message to copied_message:
#processors:
#  - copy_fields:
#      fields:
#        - from: message
#          to: copied_message
#      fail_on_error: true
#      ignore_missing: false
#
# The following example preserves the raw message under event_original, which
# is then truncated to 1024 bytes:
#processors:
#  - copy_fields:
#      fields:
#        - from: message
#          to: event_original
#      fail_on_error: false
#      ignore_missing: true
#  - truncate_fields:
#      fields:
#        - event_original
#      max_bytes: 1024
#      fail_on_error: false
#      ignore_missing: true
#
# The following example URL-decodes the value of field1 to field2:
#processors:
#  - urldecode:
#      fields:
#        - from: "field1"
#          to: "field2"
#      ignore_missing: false
#      fail_on_error: true
#
# The following example is a great way to turn on sampling in Filebeat, using
# the script processor:
#processors:
#  - script:
#      lang: javascript
#      id: my_filter
#      source: >
#        function process(event) {
#          if (Math.floor(Math.random() * 100) < 50) {
#            event.Cancel();
#          }
#        }

Filebeat offers more types of processors, as you can see here, and you can also include conditions in your processor settings. If you use Coralogix, you have an alternative to Filebeat processors up to a point, as you can define different types of parsing rules instead, through the Coralogix user interface. If you're maintaining your own ELK stack or another third-party logging tool, check out the processors whenever you need parsing.
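For instance, a condition can limit a processor to a subset of events. This minimal sketch drops events whose message contains a hypothetical health-check marker (the message content is an assumption for illustration):

processors:
  - drop_event:
      when:
        contains:
          message: "GET /healthcheck"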

Output

Configure Filebeat to write to a specific output by setting the options in the Outputs section of the filebeat.yml configuration file. Only one output may be defined. In this example, I'm using the Logstash output. This is the required option if you want to send your logs to your Coralogix account using Filebeat. For more, see the output options.

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # Boolean flag to enable or disable the output module.
  enabled: true

  # The Logstash hosts.
  hosts: ["localhost:5044"]

  # Configure escaping of HTML symbols in strings.
  escape_html: true

  # Number of workers per Logstash host.
  worker: 1

  # Optionally load-balance events between the Logstash hosts. Default is
  # false.
  loadbalance: false

  # The maximum number of seconds to wait before attempting to connect to
  # Logstash after a network error. The default is 60s.
  backoff.max: 60s

  # Optional index name. The default index name is set to filebeat
  # in all lowercase.
  index: 'filebeat'

  # The number of times to retry publishing an event after a publishing
  # failure. After the specified number of retries, the events are typically
  # dropped. Some Beats, such as Filebeat and Winlogbeat, ignore the
  # max_retries setting and retry until all events are published. Set
  # max_retries to a value less than 0 to retry until all events are
  # published. The default is 3.
  max_retries: 3

  # The maximum number of events to bulk in a single Logstash request. The
  # default is 2048.
  bulk_max_size: 2048

  # The number of seconds to wait for responses from the Logstash server
  # before timing out. The default is 30s.
  timeout: 30s

This example shows just some of the configuration options for the Logstash output; there are more. It is important to note that when you use Coralogix, you specify the following Logstash host under hosts: logstashserver.coralogix.com:5044, and that some other options are redundant, like the index name, since we set it automatically.

At this point, we have enough knowledge of Filebeat to start exploring some actual configuration files. They are commented, and you can use them as a reference for additional information on different plugins and parameters, or for more information on Filebeat in general.

Filebeat Configuration Examples

Example 1

This example uses a simple log input, forwarding only error and critical log lines to the Coralogix Logstash server (output). The chosen application name is "prd" and the subsystem is "app"; you can filter your logs later based on these metadata fields. You can add any other custom fields to your logs for more filtering options.

# ============================== Filebeat inputs ===============================
filebeat.inputs:

# Use the log input to read lines from log files
- type: log

  # Paths to the files
  paths:
    - "/var/log/application.log"

  # Include lines. A list of regular expressions to match. It exports the
  # lines that match any regular expression from the list. include_lines is
  # called before exclude_lines. By default, all lines are exported.
  include_lines: ['^CRITICAL', '^ERROR', '^ERR']

  # Normally, when set to true, custom fields are stored as top-level fields
  # in the output document instead of being grouped under a fields
  # sub-dictionary. In Coralogix it is a bit different: if you want to add
  # custom fields, you have to define the fields under root true. This will
  # also add all the Filebeat metadata, similar to the native use of this
  # option.
  fields_under_root: true

  # These are the required fields for our integration with Filebeat
  fields:
    PRIVATE_KEY: "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    COMPANY_ID: xxxx
    APP_NAME: "prd"
    SUB_SYSTEM: "app"
    # Custom field
    file_name: "application log"

# =============================== Logstash Output ==============================
output.logstash:
  enabled: true
  # Output to the Coralogix Logstash server
  hosts: ["logstashserver.coralogix.com:5044"]

Example 2

This example uses a log input, forwarding JSON log lines to the Coralogix Logstash server (output), using the decode JSON options. The subsystem name is determined by the value of one of the JSON log keys; this is done in the processors section.

# ============================== Filebeat inputs ===============================
filebeat.inputs:

# Use the log input to read lines from log files
- type: log

  # Paths to the files
  paths:
    - "/var/log/filebeat/test.log"

  # These are the required fields for our integration with Filebeat
  fields:
    PRIVATE_KEY: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    COMPANY_ID: xxxx
    APP_NAME: "test"

  # Normally, when set to true, custom fields are stored as top-level fields
  # in the output document instead of being grouped under a fields
  # sub-dictionary. In Coralogix we use it a bit differently: if you want to
  # add custom fields, you have to define the fields under root true. This
  # will also add all the Filebeat metadata.
  fields_under_root: true

  # Decode JSON options. Enable this if your logs are structured in JSON.
  json:
    # JSON key on which to apply the line filtering and multiline settings.
    # This key must be top level and its value must be a string, otherwise it
    # is ignored. If no text key is defined, the line filtering and multiline
    # features cannot be used.
    message_key: "text"

    # By default, the decoded JSON is placed under a "json" key in the output
    # document. If you enable this setting, the keys are copied to the top
    # level of the output document.
    keys_under_root: true

    # If keys_under_root and this setting are enabled, then the values from
    # the decoded JSON object overwrite the fields that Filebeat normally
    # adds (type, source, offset, etc.) in case of conflicts.
    overwrite_keys: true

    # If this setting is enabled, Filebeat adds an "error.message" and
    # "error.key: json" key in case of JSON unmarshaling errors, or when a
    # text key is defined in the configuration but cannot be used.
    add_error_key: false

# ================================= Processors =================================
processors:
  # This processor extracts the value of a JSON key into the Coralogix
  # subsystem name
  - copy_fields:
      fields:
        - from: company
          to: SUB_SYSTEM
      fail_on_error: false
      ignore_missing: true

# =============================== Logstash Output ==============================
output.logstash:
  enabled: true
  # Output to the Coralogix Logstash server
  hosts: ["logstashserver.coralogix.com:5044"]

Example 3

This example uses one log input, forwarding Nginx access log lines, adding a custom "user" field with "coralogix" as the value, and a multiline pattern to ensure that multiline log messages (logs spanning several lines separated by \n, similar to Java stack traces) are merged into single log events. The processors section discards some Filebeat metadata fields we don't care about, and the output is set to the console, so we'll see our logs in the standard output.

# ============================== Filebeat inputs ===============================
filebeat.inputs:

# Use the log input to read lines from log files
- type: log

  # Paths to the files
  paths:
    - "/var/log/filebeat/access.log"

  # Adding a custom field
  fields:
    user: "coralogix"

  # When set to true, custom fields are stored as top-level fields in the
  # output document instead of being grouped under a fields sub-dictionary.
  fields_under_root: true

  # Define a multiline pattern. This form of multiline appends consecutive
  # lines that don't match the pattern to the previous line that does.
  multiline:
    pattern: '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
    negate: true
    match: after

# ================================= Processors =================================
processors:
  - drop_fields:
      fields: ["input", "beat_host", "ecs", "agent", "tags", "offset"]
      ignore_missing: true

# =============================== Console Output ===============================
output.console:
  pretty: true

Example 4

This example uses a Redis input, forwarding Redis slow log entries to the Coralogix Logstash server (output) over a secure connection, after downloading the Coralogix SSL certificates.

# ============================== Filebeat inputs ===============================
filebeat.inputs:

- type: redis
  # List of hosts to poll to retrieve the slow log information.
  hosts: ["localhost:6379"]

  # How often the input checks the Redis slow log.
  scan_frequency: 10s

  # Redis AUTH password. Empty by default.
  password: "${redis_pwd}"

  # These are the required fields for our integration with Filebeat
  fields:
    PRIVATE_KEY: "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    COMPANY_ID: xxxx
    APP_NAME: "filebeat"
    SUB_SYSTEM: "redis"

  # Normally, when set to true, custom fields are stored as top-level fields
  # in the output document instead of being grouped under a fields
  # sub-dictionary. In Coralogix we use it a bit differently: if you want to
  # add custom fields, you have to define the fields under root true. This
  # will also add all the Filebeat metadata.
  fields_under_root: true

# =============================== Logstash Output ==============================
output.logstash:
  enabled: true
  # Output to the Coralogix Logstash server.
  # To use an encrypted connection, you need to add our certificates as
  # described in our Filebeat tutorial.
  hosts: ["logstashserver.coralogix.com:5015"]
  ssl.certificate_authorities: ["<path to folder with certificates>/ca.crt"]
