Query Examples

To get you started, here are some SQL examples.

These examples will work with both the SEC Repository and with a Private Contract Repository. This is because both types of data are put into the same dataset, whose schema is here.

You run queries into the Google Cloud BigQuery console or bq command line.

Aggregate by category

The first example shows document category counts for contracts.

The categories in this result set are the same categories that are used for Law Insider Sample categories.

SELECT doc_category,
       COUNT(doc_category) AS doc_category_count
FROM sec.documents
WHERE doc_is_contract = TRUE
GROUP BY 1
ORDER BY 1 DESC

You can see sample output here:Connected Sheets query results

List clauses that appear in sales agreements

SELECT doc_id AS doc_id,
       clauses.id AS clause_id,
       clauses.title.text AS clause_title,
       clauses.snippet.text AS clause_snippet,
       doc_category AS doc_category,
FROM sec.documents,
     UNNEST (clauses) AS clauses
WHERE TRUE
  AND clauses.title.text IS NOT NULL
  AND clauses.title.text != ''
  AND doc_is_contract = TRUE
  AND LOWER(doc_category) LIKE '%sales agreement%'
ORDER BY doc_id ASC,
         clause_id ASC

You can see sample date here:Connect Sheets query results

Notes on this Query

The clause_id is a counter that starts at 1 within each document. Clauses are listed in clause_id order.

You could use this query to:

  • Find different definitions for the same terms across 100s or 1000s of documents.

  • Uncover hidden risk hidden among multiple versions of a critical term.

Find different definitions for the same terms across your documents

This shows the definition title and snippet. If the same title has more than one snippet then you have different definitions for the same terms.

SELECT definitions.title.text AS definition_title,
       definitions.snippet.text AS definition_snippet,
       COUNT(*) AS definition_count
FROM sec.documents,
     UNNEST (definitions) AS definitions
WHERE TRUE
  AND definitions.title.text IS NOT NULL
  AND definitions.title.text != ''
GROUP BY 1,
         2

You can see sample output here: Connected Sheets query results

Generate URL to document hosted on lawinsider.com

Use this to access documents from a browser.

   SELECT
        CONCAT("https://www.lawinsider.com/",
        repository_id,
        "/",
        doc_id) AS url,
        doc_head,
        doc_category,
        sec_company_name,
        
    FROM
        {repository_id}.documents;