Skip to main content

lookup (Classic) Search Operator

The lookup (classic) operator maps data in your log messages to meaningful information saved in Sumo or on an HTTPS server. For example, you'd use a lookup operator to map "userID" to a real user's name. Or, you'd use a lookup operator to find deny-listed IP addresses.

In either case, you will point the operator to one of the following:

  • A table of saved data generated by the Save operator.
  • A CSV file hosted on an HTTPS server. Attempting to run the lookup operator against a CSV hosted on an HTTP server will not be loaded in Sumo Logic.
note

This topic has information about the classic version of the lookup operator that works with the classic Lookup Tables feature. For information about the new, more scalable Lookup Tables feature and the new lookup operator that works with it, see Lookup Tables and lookup.

Syntax

lookup <outputColumn-1> [as <field>] [,<outputColumn-2> [as <field>]] from <filePath> on <joinColumn-1> [,<joinColumn-2>]

Where:

  • outputColumn-x is a list of field names in the header of the filePath.

  • filePath is an HTTPS address of a CSV file containing the external relationship table or a table saved to the Sumo Logic file system by the Save operator.

    note

    Basic authentication is supported for CSV files, with the following syntax: <https://USERNAME:PASSWORD@company.com/userTable.cs>

  • joinColumn-x is a list of pairs of field names that define the relationship between values in the log data results with matching values in an external table.

Rules

  • The size limit for the CSV file is 8MB.
  • If using an HTTPS resource, the file must be downloaded within 10 seconds. If the file can't be downloaded in 10 seconds, it is probably too large.
  • Your joinColumn-x need to be of the same data type and are case sensitive. If your search result's field consists of integer data then the field in your external lookup table must also be integer data. You can cast data to a string or numeric value, see Casting Data to a Number or String.
  • If looking up a metadata field you need to either rename the field using the as option in the lookup operation or filter null values immediately after the lookup operation.

Structuring CSV files

Sumo Logic supports HTTPS-hosted lookup CSV files with the following restrictions:

  • The CSV file must contain a header line.

  • The header line can't use special characters.

  • Fields that contain special characters, such as commas, must be enclosed in quotes. Wrapping all fields in your CSV file is worthwhile, though, as it would prevent any upload issues that might occur, should special characters be introduced in the future.

  • No spaces are allowed between quotes and values. For example:

    "id","name","time"
    "1","foo","6-15-12"
    "2","zoo","6-14-12"
    "3","woo","6-13-12"
  • An URL provided to a lookup (classic) operator should be consistent in its use of query escape/unescaped parameters. 

    For example, the following URL is invalid since it is inconsistent in its use of escape characters (%7C versus |):

    http://localhost:5000/test?v=csv&f1=indicator%7CSHA1&f2=indicator|SHA1

    These URLs will be accepted:

    http://localhost:5000/test?v=csv&f1=indicator*|SHA1&f2=indicator|*``SHA1``http``://localhost:5000/test?v=csv&f1=indicator*%7CSHA1&f2=indicator%7C*SHA1

Dashboard limitation

The lookup operator behaves differently when used in live mode versus interactive mode or an interactive search. When used in live mode the lookup operation is done continually to provide real-time results. However, only the most recent data point is looked up in real time, while the previous data points keep their previously looked up result. An interactive search will conduct the lookup operation on all data points when the query is processed. Therefore, when comparing live mode results to interactive results you will likely have differences in your lookup results.

For example, say you are plotting the average price of a stock over the last 30 days.

In live mode, lookup will return the real-time price and retain the previously looked up data points during the 30 day period.

In an interactive search, lookup will only use the real-time stock price to plot over the past 30 days. In this case, you'd have to provide the previous stock prices for the past 30 days.

In other words, in live mode, lookup will use and retain the lookup data at that point in time when it ran. Whereas lookup in an interactive search will only use the data that was available when it ran.

Examples

Type the lookup operator in the Search tab, just as you'd any other operator.

To match the userID string with a users' ID in your CSV, your query could be:

* | parse "name=*, phone number=*," as (name, phone)
| lookup email from https://company.com/userTable.csv on name=userName, phone=cell

where the userTable.csv file includes the following:

"id","userName","email","IP","cell"
"1","Joe","joe@example.com","192.168.1.1","650-123-4567"
"2","John","john@example.com","192.168.1.2","212-123-4567"
"3","Susan","susan@example.com","192.168.1.3","914-123-4567"
"4","John","another_john@example.com",192.168.1.4","408-123-4567"
"5","John","yet_another_john@example.com","192.169.1.5","734-123-4567"

Running this query adds three fields to the output: name, phone, and email.

Composite field lookup

In our example above we had several users named John. A lookup operator can be used on a composite set of fields, so you can identify the correct email for each person named John because each unique cell phone number has also been mapped using a query like:

* | parse "name=*, phone number=*," as (name, phone)
| lookup email from https://company.com/userTable.csv on name=userName, phone=cell

Running this query adds an email field to the output.

Using multiple lookup operators together

Another way to use a lookup operator is to chain lookup operators together. Each operator can call separate CSV files. For example, if you wanted to find user names and the position each user has in a company, your query could be:

* | parse "userID=*," as userID
| lookup userName from https://company.com/userTable.csv on userID=id
| lookup position from https://company.com/userPosition.csv on userID=id

where the userPosition.csv file includes the following:

"id","position"
"1","Salesperso"
"2","Salesperson"
"3","Engineer"
"4","Manager"
"5","Senior Engineer"

In our example above, the first operator finds the name, and the second finds the position.

Handling null values

To find a mismatch from a lookup operator query, use the isNull operator.

For example, running a query like:

| parse "code=*]" as code
| lookup status_code from shared/statusupdates on status = code
| if (isNull(status_code), "unknown", status_code) as status_code

Using lookup to access saved data

Once you've saved the results of a search to the Sumo Logic file system using a Save operator, the lookup operator allows you to search that data.

For example, say we wanted to find the date when users signed up in a file named newDailyUsers (the full path is myFolder/mySubFolder/newDailyUsers). We'd use this query to find that information:

* | parse "user_name=*," as name
| lookup date from myFolder/mySubFolder/newDailyUsers on name=name

A file generated by a save operator can be saved to an org-level shared folder. This allows for others in your organization to use your search results when running their lookup queries. See saving files to a shared location for details.

Duplicate keys error

If the key you specify in a lookup operation matches several records, you get an error message that warns you of the duplication:

Lookup table folder/myfolder/filename has duplicate keys. The last value associated with a duplicated key will be used in the lookup result.

You only get the last associated value as a result. 

For example, if you are searching your Apache Access logs from 34.87.4.6 and you are looking for an internal server errors by a specific keyid, lookup provides the last result that matches your criteria:

lookup-duplicate.png

Legal
Privacy Statement
Terms of Use

Copyright © 2023 by Sumo Logic, Inc.