SEC Repository

In the U.S. public companies, certain insiders and broker-dealers are required to regularly file with the SEC. The SEC makes this data available online for anybody to view and use via their Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. The SEC updates this data every quarter going back to January, 2009.

Every day, Law Insider downloads new documents from the SEC and processes them into the SEC Repository. The added structure to the data present in the documents and contracts provided by Law Insider creates new opportunities to uncover insights and perform novel analysis.

Big Query has also created an SEC Public Dataset that is focused on financial reporting details. Use the sec_company_cik column to join data from the SEC Repository to a company's financial reporting data.

If you use the python-edgar library, you will find join keys for each row in the master index.

Select 10-K documents from CHEVRON that are contracts

SELECT
   CONCAT("https:// www.lawinsider.com / ", repository_id, " / ", doc_id) AS url,
   doc_category,
FROM
   sec.documents 
WHERE
   doc_is_contract = TRUE 
   AND sec_filing_type LIKE '10-K' 
   AND LOWER(sec_company_name) LIKE '%chevron%'
limit 5

Results:

urldoc_category
https:// www.lawinsider.com / sec / fie2y0WVKMfpower of attorney
https:// www.lawinsider.com / sec / b36tj3pwUsxlease
https:// www.lawinsider.com / sec / 9jFdJBzG8eIpower of attorney
https:// www.lawinsider.com / sec / i3kjcww4HRgpower of attorney
https:// www.lawinsider.com / sec / hrj3Drocm39power of attorney

Join Law Insider SEC Repository with the Big Query SEC Public Dataset

Use this to join the SEC data in Law Insider to the sec_quarterly_financials.submissions table in Big Query to get submissions together with company's financial reporting data.

WITH li_docs AS
  (SELECT *
   FROM sec.documents)
SELECT sq.*,
       li_docs.*,
FROM `bigquery-public-data.sec_quarterly_financials.submission` sq
JOIN li_docs ON li_docs.sec_company_cik=sq.central_index_key;

Here are some of the columns in submissions table:

column_name
submission_number
report
file_type
menu_category
shortname
longname
roleuri
parentroleuri
parentreport
ultimate_parent_report
_PARTITIONTIME
submission_number
company_name
measure_tag
version
period_end_date
number_of_quarters
units
dimension_hash

Get Standard Industry Classification (SIC) title for each company

WITH
  li_docs AS (SELECT * FROM sec.documents)
SELECT
  sc.sic_code,
  sc.industry_title,
  li_docs.sec_company_name
FROM
  `bigquery-public-data.sec_quarterly_financials.sic_codes` sc 
JOIN
  li_docs
ON
  CAST(li_docs.sec_company_sic AS STRING)=sc.sic_code
limit 5

Results:

sic_codeindustry_titlesec_company_name
7389SERVICES-BUSINESS SERVICES, NECuVuMobile, Inc
6799INVESTORS, NECADVANCED TECHNOLOGY INDUSTRIES INC
4924NATURAL GAS DISTRIBUTIONPEOPLES ENERGY CORP
4832RADIO BROADCASTING STATIONSCOX RADIO INC
3841SURGICAL & MEDICAL INSTRUMENTS & APPARATUSKENSEY NASH CORP%

Create URL to document hosted at sec.gov

SELECT
   CONCAT("https:// www.sec.gov / Archives / edgar / data / ", sec_company_cik, " / ", REPLACE(sec_filing_id, " - ", ""), " / ", doc_filename) AS url,
   doc_head,
   doc_category,
   sec_company_name,
FROM
   sec.documents
limit 5

Results in Connected Google Sheets.