Definitions and Sample Contracts Data
Here we define some of the terms used in this documentation.
- Analytics Hub: the distribution model within BigQuery. It is the mechanism used to expose a dataset for a private repository, and it is also how Google distributes the SEC Repository.
- Dataset: a collection of BigQuery tables. In Law Insider terms, if you create a private repository named, e.g., private.my_contracts48dc, the dataset you can subscribe to in the Analytics Hub is also called private.my_contracts48dc. You can then query all the metadata of the documents uploaded to private.my_contracts48dc.
- SEC Repository: downloaded from Google BigQuery every day, so you can query your contracts and SEC financial data in the same table. To distinguish SEC documents from contracts, filter on the column document_is_contract.
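As a rough illustration of filtering on document_is_contract, here is a minimal Python sketch that only builds the query text. Several things in it are assumptions rather than documented facts: the table name `documents` is hypothetical, document_is_contract is assumed to be a boolean column, and the helper build_contract_filter_query is made up for this example. Substitute the dataset and table names you actually see after subscribing in the Analytics Hub.

```python
def build_contract_filter_query(dataset: str, table: str,
                                contracts_only: bool = True,
                                limit: int = 10) -> str:
    """Return a BigQuery Standard SQL string that filters on document_is_contract.

    `dataset` and `table` are placeholders supplied by the caller; this helper
    is hypothetical and not part of any Law Insider or Google library.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Assumption: document_is_contract is a BOOL column
    # (TRUE for contracts, FALSE for other SEC documents).
    flag = "TRUE" if contracts_only else "FALSE"
    return (
        "SELECT *\n"
        f"FROM `{dataset}.{table}`\n"
        f"WHERE document_is_contract = {flag}\n"
        f"LIMIT {limit}"
    )


# Example: contracts only, from the private repository dataset named above.
# The table name "documents" is a guess; replace it with the real one.
query = build_contract_filter_query("private.my_contracts48dc", "documents")
print(query)
```

The resulting string can be pasted into the BigQuery console or passed to whichever BigQuery client you use. Passing contracts_only=False returns the rows that are not flagged as contracts instead.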
Here are some contracts you can download and then upload to a private repository to get started:
https://www.lawinsider.com/tags/country lists a large number of contracts.
The Stanford ContractNLI dataset also includes contracts in PDF format that are suitable for uploading to a private repository. Click 'download' at the bottom of this page: https://stanfordnlp.github.io/contract-nli/