dedup Search Operator

The dedup operator removes duplicate results. You can remove only consecutive duplicates, and you can limit the comparison to specific fields. This lets you filter your results down to the most recent event, or the most recent few events, for each identical combination of field values.

For example, to find the most recent event for each service, use the following operation: | dedup 1 by service.

Supported features

The dedup operator is supported for the following features:

Syntax

dedup [consecutive] [<int>] [by <field>[, <field2>, ...]]
Parameter: consecutive
Description: Removes duplicate combinations of values that are in succession.
Example: Remove only consecutive duplicate events and keep non-consecutive duplicate events. In this example, events must have the same combination of values in the source and host fields to be removed as duplicates. Non-consecutive events with the same combination of source and host values are retained.
`...

Parameter: int
Description: Specifies the number of most recent events to return.
Example: For search results that have the same source value, keep the first three that occur and remove all subsequent search results.
`...

Parameter: field
Description: A comma-separated list of field names to remove duplicate values from. If no fields are specified, the query is run against _raw, the full raw log message. For example, dedup is the same as dedup by _raw.
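
The syntax above can be modelled outside the search language to make the three parameters concrete. The following Python sketch is only an illustration of the described behaviour, not the operator's actual implementation. The function name `dedup`, the event dictionaries, and the field values are hypothetical, and combining `consecutive` with a count is interpreted here as "keep up to that many events per run", which is an assumption.

```python
from collections import defaultdict


def dedup(rows, limit=1, by=None, consecutive=False):
    """Model of `dedup [consecutive] [<int>] [by <fields>]` over a list of dicts.

    `rows` are assumed to be in the order the search returns them.
    """
    by = by or ["_raw"]          # no fields given: compare the raw message
    kept = []
    seen = defaultdict(int)      # key -> occurrences so far (non-consecutive mode)
    prev_key, run = None, 0      # current run of identical keys (consecutive mode)
    for row in rows:
        key = tuple(row.get(field) for field in by)
        if consecutive:
            run = run + 1 if key == prev_key else 1
            prev_key = key
            count = run
        else:
            seen[key] += 1
            count = seen[key]
        if count <= limit:
            kept.append(row)
    return kept


# Hypothetical events, listed in result order.
events = [
    {"_raw": "a", "service": "auth", "host": "h1"},
    {"_raw": "b", "service": "auth", "host": "h1"},
    {"_raw": "c", "service": "db", "host": "h2"},
    {"_raw": "d", "service": "auth", "host": "h1"},
]


def raws(rows):
    return [row["_raw"] for row in rows]


by_service = raws(dedup(events, by=["service"]))
two_per_service = raws(dedup(events, limit=2, by=["service"]))
consecutive_only = raws(dedup(events, by=["service", "host"], consecutive=True))
no_fields = raws(dedup(events))

print(by_service)        # ['a', 'c']
print(two_per_service)   # ['a', 'b', 'c']
print(consecutive_only)  # ['a', 'c', 'd']
print(no_fields)         # ['a', 'b', 'c', 'd']
```

In this model, event "d" survives the consecutive variant because it is separated from the earlier auth events by event "c".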

Rules

  • Non-aggregate and aggregate queries are supported.
    • non-aggregate queries process up to 100k results.
    • aggregate queries process all results.
  • Use the sort operator before dedup to control the order of removed results.
  • Running dedup against the full raw log message is inefficient and is not recommended.
  • The histogram only shows results the dedup operator returned.
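
The rule about sorting before dedup matters because the first result seen for each key is the one that survives. The Python sketch below illustrates this on a subset of the sample data from the Examples section, assuming results arrive most recent first by default. The helper `first_per_country` is hypothetical and only models the behaviour.

```python
# (time, city, country) for a subset of the sample rows, most recent first.
rows = [
    ("11:32:00", "Las Vegas", "USA"),
    ("11:29:00", "Chennai", "India"),
    ("11:27:00", "Florida", "USA"),
    ("11:24:00", "San Francisco", "USA"),
    ("11:22:00", "Kolkata", "India"),
]


def first_per_country(ordered_rows):
    """Keep the first row seen for each country, in the order given."""
    seen = set()
    kept = []
    for _timestamp, city, country in ordered_rows:
        if country not in seen:
            seen.add(country)
            kept.append(city)
    return kept


# Default order: the most recent record per country survives.
newest_first = first_per_country(rows)

# Sorting by time ascending first makes the oldest record per country survive.
oldest_first = first_per_country(sorted(rows, key=lambda row: row[0]))

print(newest_first)  # ['Las Vegas', 'Chennai']
print(oldest_first)  # ['Kolkata', 'San Francisco']
```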

Examples

The following examples use this sample data.

Timestamp            City           Country  Continent      Population (in millions)
05/09/2021 11:32:00  Las Vegas      USA      North America  2.31
05/09/2021 11:32:00  Paris          France
05/09/2021 11:30:00  Karachi                 Asia
05/09/2021 11:29:00  Chennai        India    Asia
05/09/2021 11:28:05  Mumbai         India    Asia
05/09/2021 11:28:00  Bangalore      India    Asia
05/09/2021 11:27:00  Florida        USA      North America
05/09/2021 11:26:00  Washington     USA      North America
05/09/2021 11:25:00  New York       USA      North America
05/09/2021 11:24:00  San Francisco  USA      North America  8.5
05/09/2021 11:23:00  Delhi          India    Asia
05/09/2021 11:22:00  Kolkata        India    Asia

Remove duplicate search results by country

| dedup by country

Returns the most recent record for each country:

[Image: results of dedup by country]

Keep the first 3 duplicate results

For search results that have the same country value, keep the first three that occur and remove all subsequent search results.

| dedup 3 by country

Returns the following results:

[Image: results of dedup 3 by country]

Keep results with same combination of values in multiple fields

For search results that have the same country AND continent values, keep the first two search results that occur and remove all subsequent results.

| dedup 2 by country, continent

Returns the following results:

[Image: results of dedup 2 by country, continent]

Remove only consecutive duplicate events

Remove only consecutive duplicate events and keep non-consecutive duplicate events. In this example, events must have the same combination of values in the country and continent fields to be removed as duplicates. Non-consecutive events with the same combination of country and continent values are retained.

| dedup consecutive by country, continent

Returns the following results:

[Image: results of dedup consecutive by country, continent]
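
The difference between consecutive and ordinary deduplication can be sketched in Python with `itertools.groupby`, which groups only adjacent items that share a key. This is an illustrative model under stated assumptions, not the operator's implementation. The Paris and Karachi rows are left out because each is missing one of the two fields in the sample data.

```python
from itertools import groupby

# (city, country, continent) from the sample data, most recent first.
# Paris and Karachi are omitted: each lacks a country or continent value.
rows = [
    ("Las Vegas", "USA", "North America"),
    ("Chennai", "India", "Asia"),
    ("Mumbai", "India", "Asia"),
    ("Bangalore", "India", "Asia"),
    ("Florida", "USA", "North America"),
    ("Washington", "USA", "North America"),
    ("New York", "USA", "North America"),
    ("San Francisco", "USA", "North America"),
    ("Delhi", "India", "Asia"),
    ("Kolkata", "India", "Asia"),
]


def key(row):
    return (row[1], row[2])   # compare on country and continent together


# Consecutive mode: keep the first row of every run of identical keys.
consecutive = [next(group)[0] for _, group in groupby(rows, key=key)]

# Ordinary mode: keep the first row per key, wherever it appears.
seen = set()
everywhere = []
for row in rows:
    if key(row) not in seen:
        seen.add(key(row))
        everywhere.append(row[0])

print(consecutive)  # ['Las Vegas', 'Chennai', 'Florida', 'Delhi']
print(everywhere)   # ['Las Vegas', 'Chennai']
```

In this model, Florida and Delhi are retained in consecutive mode because their key combinations reappear only after a different combination has intervened.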

Copyright © 2023 by Sumo Logic, Inc.