├── Evidence
    ├── .evidence
    │   └── customization
    │   │   ├── .profile.json
    │   │   └── custom-formatting.json
    ├── .github
    │   └── workflows
    │   │   └── release.yml
    ├── .gitignore
    ├── .npmrc
    ├── .vscode
    │   └── extensions.json
    ├── README.md
    ├── evidence.plugins.yaml
    ├── package-lock.json
    ├── package.json
    ├── pages
    │   └── index.md
    └── sources
    │   └── vc_data
    │       ├── connection.yaml
    │       ├── vc_data.sql
    │       └── vc_database.duckdb
├── Jupyter DuckDB SEC From D Example.ipynb
├── Jupyter DuckDB SEC From D Example.py
├── LICENSE
└── README.md


/Evidence/.evidence/customization/.profile.json:
--------------------------------------------------------------------------------
1 | {"anonymousId":"758a88ac-8bac-43a1-9b7e-a1a2b4bd31cb","traits":{"projectCreated":"2024-08-14T06:21:41.173Z"}}
2 |

--------------------------------------------------------------------------------
/Evidence/.evidence/customization/custom-formatting.json:
--------------------------------------------------------------------------------
1 | {
2 |   "version": "1.0",
3 |   "customFormats": []
4 | }

--------------------------------------------------------------------------------
/Evidence/.github/workflows/release.yml:
--------------------------------------------------------------------------------
1 | # This installs the dependencies, zips the files, uploads the artifact, and creates a GitHub release.
2 | # This appears to work for Windows, but in macOS it has permissions issues. Linux not tested.
3 | name: Package and Release
4 |
5 | on:
6 |   push:
7 |     tags:
8 |       - '*'
9 |
10 | jobs:
11 |   build:
12 |     runs-on: ${{ matrix.os }}
13 |     strategy:
14 |       matrix:
15 |         include:
16 |           # MacOS
17 |           - os: macos-latest
18 |             arch: x86_64
19 |             node: 18
20 |           - os: macos-latest
21 |             arch: arm64
22 |             node: 18
23 |           - os: macos-latest
24 |             arch: x86_64
25 |             node: 20
26 |           - os: macos-latest
27 |             arch: arm64
28 |             node: 20
29 |           # Ubuntu
30 |           - os: ubuntu-latest
31 |             arch: x86_64
32 |             node: 18
33 |           - os: ubuntu-latest
34 |             arch: arm64
35 |             node: 18
36 |           - os: ubuntu-latest
37 |             arch: x86_64
38 |             node: 20
39 |           - os: ubuntu-latest
40 |             arch: arm64
41 |             node: 20
42 |           # Windows
43 |           - os: windows-latest
44 |             arch: x86_64
45 |             node: 18
46 |           - os: windows-latest
47 |             arch: x86_64
48 |             node: 20
49 |     steps:
50 |       - name: Checkout code
51 |         uses: actions/checkout@v2
52 |
53 |       - name: Set up Node.js ${{ matrix.node }}
54 |         uses: actions/setup-node@v2
55 |         with:
56 |           node-version: ${{ matrix.node }}
57 |
58 |       - name: Install dependencies
59 |         run: npm install
60 |
61 |       - name: Zip files (Windows)
62 |         if: runner.os == 'Windows'
63 |         run: powershell.exe -Command "Compress-Archive -Path . -DestinationPath evidence-${{ matrix.os }}-${{ matrix.arch }}-node${{ matrix.node }}.zip -Force"
64 |
65 |       - name: Zip files (Unix)
66 |         if: runner.os != 'Windows'
67 |         run: zip -r evidence-${{ matrix.os }}-${{ matrix.arch }}-node${{ matrix.node }}.zip . -x "*.git*"
68 |
69 |       - name: Upload artifact
70 |         uses: actions/upload-artifact@v2
71 |         with:
72 |           name: evidence-${{ matrix.os }}-${{ matrix.arch }}-node${{ matrix.node }}
73 |           path: evidence-${{ matrix.os }}-${{ matrix.arch }}-node${{ matrix.node }}.zip
74 |
75 |   release:
76 |     runs-on: ubuntu-latest
77 |     needs: build
78 |     steps:
79 |       - name: Checkout code
80 |         uses: actions/checkout@v2
81 |
82 |       - name: Download all artifacts
83 |         uses: actions/download-artifact@v2
84 |         with:
85 |           path: .
86 |
87 |       - name: Create or Update GitHub Release
88 |         uses: ncipollo/release-action@v1
89 |         with:
90 |           artifacts: '**/evidence-*.zip'
91 |           token: ${{ secrets.GH_ACTIONS_RELEASE_TOKEN }}
92 |           tag: ${{ github.ref }}
93 |           name: Release ${{ github.ref_name }}
94 |           body: |
95 |             Release for ${{ github.ref_name }} with artifacts.
96 |           draft: false
97 |           prerelease: false
98 |           allowUpdates: true

--------------------------------------------------------------------------------
/Evidence/.gitignore:
--------------------------------------------------------------------------------
1 | .evidence/template
2 | .svelte-kit
3 | build
4 | node_modules
5 | .DS_Store
6 | static/data
7 | *.options.yaml
8 | .vscode/settings.json
9 | .env
10 | .evidence/meta
11 |

--------------------------------------------------------------------------------
/Evidence/.npmrc:
--------------------------------------------------------------------------------
1 | loglevel=error
2 | audit=false
3 | fund=false
4 |

--------------------------------------------------------------------------------
/Evidence/.vscode/extensions.json:
--------------------------------------------------------------------------------
1 | {
2 |   "recommendations": [
3 |     "evidence.evidence-vscode"
4 |   ]
5 | }

--------------------------------------------------------------------------------
/Evidence/README.md:
--------------------------------------------------------------------------------
1 | # Evidence Template Project
2 |
3 | ## Using Codespaces
4 |
5 | If you are using this template in Codespaces, click the `Start Evidence` button in the bottom status bar. This will install dependencies and open a preview of your project in your browser - you should get a popup prompting you to open in browser.
6 |
7 | Or you can use the following commands to get started:
8 |
9 | ```bash
10 | npm install
11 | npm run sources
12 | npm run dev -- --host 0.0.0.0
13 | ```
14 |
15 | See [the CLI docs](https://docs.evidence.dev/cli/) for more command information.
16 |
17 | **Note:** Codespaces is much faster on the Desktop app. After the Codespace has booted, select the hamburger menu → Open in VS Code Desktop.
18 |
19 | ## Get Started from VS Code
20 |
21 | The easiest way to get started is using the [VS Code Extension](https://marketplace.visualstudio.com/items?itemName=evidence-dev.evidence):
22 |
23 | 1. Install the extension from the VS Code Marketplace
24 | 2. Open the Command Palette (Ctrl/Cmd + Shift + P) and enter `Evidence: New Evidence Project`
25 | 3. Click `Start Evidence` in the bottom status bar
26 |
27 | ## Get Started using the CLI
28 |
29 | ```bash
30 | npx degit evidence-dev/template my-project
31 | cd my-project
32 | npm install
33 | npm run sources
34 | npm run dev
35 | ```
36 |
37 | Check out the docs for [alternative install methods](https://docs.evidence.dev/getting-started/install-evidence) including Docker, GitHub Codespaces, and alongside dbt.
38 |
39 |
40 |
41 | ## Learning More
42 |
43 | - [Docs](https://docs.evidence.dev/)
44 | - [GitHub](https://github.com/evidence-dev/evidence)
45 | - [Slack Community](https://slack.evidence.dev/)
46 | - [Evidence Home Page](https://www.evidence.dev)
47 |

--------------------------------------------------------------------------------
/Evidence/evidence.plugins.yaml:
--------------------------------------------------------------------------------
1 |
2 | components:
3 |   # This loads all of evidence's core charts and UI components
4 |   # You probably don't want to edit this dependency unless you know what you are doing
5 |   "@evidence-dev/core-components": {}
6 |
7 | datasources:
8 |   # You can add additional datasources here by adding npm packages.
9 |   # Make sure to also add them to `package.json`.
10 |   "@evidence-dev/bigquery": { }
11 |   "@evidence-dev/csv": { }
12 |   "@evidence-dev/databricks": { }
13 |   "@evidence-dev/duckdb": { }
14 |   "@evidence-dev/mssql": { }
15 |   "@evidence-dev/mysql": { }
16 |   "@evidence-dev/postgres": { }
17 |   "@evidence-dev/snowflake": { }
18 |   "@evidence-dev/sqlite": { }
19 |   "@evidence-dev/trino": { }
20 |   "@evidence-dev/motherduck": { }
21 |

--------------------------------------------------------------------------------
/Evidence/package.json:
--------------------------------------------------------------------------------
1 | {
2 |   "name": "my-evidence-project",
3 |   "version": "0.0.1",
4 |   "scripts": {
5 |     "build": "evidence build",
6 |     "build:strict": "evidence build:strict",
7 |     "dev": "evidence dev --open /",
8 |     "test": "evidence build",
9 |     "sources": "evidence sources",
10 |     "preview": "evidence preview"
11 |   },
12 |   "engines": {
13 |     "npm": ">=7.0.0",
14 |     "node": ">=18.0.0"
15 |   },
16 |   "type": "module",
17 |   "dependencies": {
18 |     "@evidence-dev/bigquery": "^2.0.7",
19 |     "@evidence-dev/core-components": "^4.7.4",
20 |     "@evidence-dev/csv": "^1.0.12",
21 |     "@evidence-dev/databricks": "^1.0.7",
22 |     "@evidence-dev/duckdb": "^1.0.11",
23 |     "@evidence-dev/evidence": "^39.1.3",
24 |     "@evidence-dev/motherduck": "^1.0.2",
25 |     "@evidence-dev/mssql": "^1.0.9",
26 |     "@evidence-dev/mysql": "^1.1.3",
27 |     "@evidence-dev/postgres": "^1.0.6",
28 |     "@evidence-dev/snowflake": "^1.1.0",
29 |     "@evidence-dev/sqlite": "^2.0.6",
30 |     "@evidence-dev/trino": "^1.0.8"
31 |   },
32 |   "overrides": {
33 |     "jsonwebtoken": "9.0.0",
34 |     "trim@<0.0.3": ">0.0.3",
35 |     "sqlite3": "5.1.5"
36 |   }
37 | }
38 |

--------------------------------------------------------------------------------
/Evidence/pages/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: VC Fundraising
3 | ---
4 |
5 | ```sql vc_data
6 | select
7 |   ENTITYNAME, sum(TOTALAMOUNTSOLD)
8 | from vc_data.vc_data
9 | group by ENTITYNAME
10 | order by 2 desc
11 | limit 1000
12 |
13 | ```
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 | ```sql data_raised_by_firm
31 | select
32 |   date_trunc('month', normalized_date::DATE) as month,
33 |   ENTITYNAME,
34 |   sum(TOTALAMOUNTSOLD) as total_raised,
35 | from vc_data.vc_data
36 | where ENTITYNAME like '${inputs.ENTITYNAME.value}'
37 | and date_part('year', normalized_date::DATE) like '${inputs.year.value}'
38 | and INVESTMENTFUNDTYPE = 'Venture Capital Fund'
39 | group by all
40 | order by total_raised desc
41 | limit 10000
42 | ```
43 |
44 |
51 |
52 | ```sql heatmap
53 | select
54 |   normalized_date::DATE as date,
55 |   sum(TOTALAMOUNTSOLD) as total_raised,
56 | from vc_data.vc_data
57 | where ENTITYNAME like '${inputs.ENTITYNAME.value}'
58 | and date_part('year', normalized_date::DATE) like '${inputs.year.value}'
59 | and INVESTMENTFUNDTYPE = 'Venture Capital Fund'
60 | group by all
61 | ```
62 |
63 |
70 |
71 | ```sql datatable
72 | select
73 |   normalized_date::DATE as date,
74 |   ENTITYNAME,
75 |   sum(TOTALAMOUNTSOLD) as total_raised,
76 | from vc_data.vc_data
77 | where ENTITYNAME like '${inputs.ENTITYNAME.value}'
78 | and date_part('year', normalized_date::DATE) like '${inputs.year.value}'
79 | and INVESTMENTFUNDTYPE = 'Venture Capital Fund'
80 | group by all
81 | order by total_raised desc
82 | ```
83 |
84 |
85 |
86 |
87 |
88 | `
89 |
90 |
91 | ```sql chart_query
92 | select
93 |   ENTITYNAME,
94 |   sum(TOTALAMOUNTSOLD) as total_raised,
95 | from vc_data.vc_data
96 | where ENTITYNAME like '${inputs.ENTITYNAME.value}'
97 | and INVESTMENTFUNDTYPE = 'Venture Capital Fund'
98 | group by all
99 | order by total_raised desc
100 | limit 100
101 | ```
102 |
103 |
110 |

--------------------------------------------------------------------------------
/Evidence/sources/vc_data/connection.yaml:
--------------------------------------------------------------------------------
1 | # This file was automatically generated
2 | name: vc_data
3 | type: duckdb
4 | options:
5 |   filename: vc_database.duckdb
6 |

--------------------------------------------------------------------------------
/Evidence/sources/vc_data/vc_data.sql:
--------------------------------------------------------------------------------
1 | select * from vc_data

--------------------------------------------------------------------------------
/Evidence/sources/vc_data/vc_database.duckdb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethanfinkel/SEC_Form_D_Evidence/8b616f168c0e7785196ab9b8d2dd050cef69be34/Evidence/sources/vc_data/vc_database.duckdb

--------------------------------------------------------------------------------
/Jupyter DuckDB SEC From D Example.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "code",
5 |    "execution_count": 1,
6 |    "id": "63f4be19",
7 |    "metadata": {},
8 |    "outputs": [],
9 |    "source": [
10 |     "#Imports\n",
11 |     "import pandas as pd\n",
12 |     "import duckdb as db\n",
13 |     "import numpy as np\n",
14 |     "from dateutil import parser\n"
15 |    ]
16 |   },
17 |   {
18 |    "cell_type": "code",
19 |    "execution_count": 2,
20 |    "id": "7774b369",
21 |    "metadata": {},
22 |    "outputs": [],
23 |    "source": [
24 |     "\"\"\"\n",
25 |     "Specify path where form D files are stored. \n",
26 |     "These can be downloaded from https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets.\n",
27 |     "This notebook is structured to have each folder from the SEC data set stored in the folder Form_D_data. \n",
28 |     "\"\"\"\n",
29 |     "path = \"../Form_D_data\""
30 |    ]
31 |   },
32 |   {
33 |    "cell_type": "code",
34 |    "execution_count": 3,
35 |    "id": "a4030e06",
36 |    "metadata": {},
37 |    "outputs": [
38 |     {
39 |      "data": {
40 |       "text/html": [
41 |        "
| | ACCESSIONNUMBER | INDUSTRYGROUPTYPE | INVESTMENTFUNDTYPE | IS40ACT | REVENUERANGE | AGGREGATENETASSETVALUERANGE | FEDERALEXEMPTIONS_ITEMS_LIST | ISAMENDMENT | PREVIOUSACCESSIONNUMBER | SALE_DATE | ... | TOTALNUMBERALREADYINVESTED | SALESCOMM_DOLLARAMOUNT | SALESCOMM_ISESTIMATE | FINDERSFEE_DOLLARAMOUNT | FINDERSFEE_ISESTIMATE | FINDERFEECLARIFICATIONOFRESP | GROSSPROCEEDSUSED_DOLLARAMOUNT | GROSSPROCEEDSUSED_ISESTIMATE | GROSSPROCEEDSUSED_CLAROFRESP | AUTHORIZEDREPRESENTATIVE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0001477399-09-000001 | Investment Banking | None | None | $5,000,001 - $25,000,000 | None | 06 | False | None | None | ... | 0 | 0 | None | 0 | None | None | 0 | None | None | False |
| 1 | 0001192482-09-000126 | Pooled Investment Fund | Private Equity Fund | false | Decline to Disclose | None | 06, 3C, 3C.7 | False | None | 2008-05-15 | ... | 6 | 0 | None | 0 | None | Financial advisers of individual investors may... | 0 | None | None | False |
| 2 | 0001446406-08-000001 | Restaurants | None | None | Decline to Disclose | None | 06 | False | None | 2007-09-21 | ... | 5 | 0 | None | 0 | None | None | 0 | None | None | True |
| 3 | 0001446409-08-000001 | Restaurants | None | None | No Revenues | None | 06 | False | None | 2008-09-30 | ... | 2 | 0 | None | 0 | None | None | 0 | None | None | True |
| 4 | 0001446405-08-000001 | Restaurants | None | None | $5,000,001 - $25,000,000 | None | 06 | False | None | 2004-01-26 | ... | 41 | 0 | None | 0 | None | None | 0 | None | None | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 647283 | 0001062993-23-023297 | Other | None | None | Not Applicable | None | 06b | False | None | 2023-12-22 | ... | 1 | 0 | None | 0 | None | None | 0 | None | None | False |
| 647284 | 0002005968-23-000002 | Pooled Investment Fund | Private Equity Fund | false | Decline to Disclose | None | 06b, 3C, 3C.7 | False | None | None | ... | 0 | 0 | None | 0 | None | None | 0 | None | None | False |
| 647285 | 0001750134-23-000001 | Pooled Investment Fund | Hedge Fund | false | None | Decline to Disclose | 06c | True | 0001750134-22-000001 | 2018-08-03 | ... | 578 | 0 | None | 0 | None | None | 0 | None | None | False |
| 647286 | 0002005978-23-000002 | Pooled Investment Fund | Private Equity Fund | false | Decline to Disclose | None | 06b, 3C, 3C.7 | False | None | None | ... | 0 | 0 | None | 0 | None | None | 0 | None | None | False |
| 647287 | 0002006335-23-000001 | Pooled Investment Fund | Other Investment Fund | false | None | Decline to Disclose | 06b, 3C, 3C.9 | False | None | None | ... | 0 | 0 | None | 0 | None | None | 0 | None | The Investment Manager receives a management f... | False |

647288 rows × 41 columns

\n", 350 | "
"
351 |       ],
352 |       "text/plain": [
353 |        " ACCESSIONNUMBER INDUSTRYGROUPTYPE INVESTMENTFUNDTYPE \\\n",
354 |        "0 0001477399-09-000001 Investment Banking None \n",
355 |        "1 0001192482-09-000126 Pooled Investment Fund Private Equity Fund \n",
356 |        "2 0001446406-08-000001 Restaurants None \n",
357 |        "3 0001446409-08-000001 Restaurants None \n",
358 |        "4 0001446405-08-000001 Restaurants None \n",
359 |        "... ... ... ... \n",
360 |        "647283 0001062993-23-023297 Other None \n",
361 |        "647284 0002005968-23-000002 Pooled Investment Fund Private Equity Fund \n",
362 |        "647285 0001750134-23-000001 Pooled Investment Fund Hedge Fund \n",
363 |        "647286 0002005978-23-000002 Pooled Investment Fund Private Equity Fund \n",
364 |        "647287 0002006335-23-000001 Pooled Investment Fund Other Investment Fund \n",
365 |        "\n",
366 |        " IS40ACT REVENUERANGE AGGREGATENETASSETVALUERANGE \\\n",
367 |        "0 None $5,000,001 - $25,000,000 None \n",
368 |        "1 false Decline to Disclose None \n",
369 |        "2 None Decline to Disclose None \n",
370 |        "3 None No Revenues None \n",
371 |        "4 None $5,000,001 - $25,000,000 None \n",
372 |        "... ... ... ... \n",
373 |        "647283 None Not Applicable None \n",
374 |        "647284 false Decline to Disclose None \n",
375 |        "647285 false None Decline to Disclose \n",
376 |        "647286 false Decline to Disclose None \n",
377 |        "647287 false None Decline to Disclose \n",
378 |        "\n",
379 |        " FEDERALEXEMPTIONS_ITEMS_LIST ISAMENDMENT PREVIOUSACCESSIONNUMBER \\\n",
380 |        "0 06 False None \n",
381 |        "1 06, 3C, 3C.7 False None \n",
382 |        "2 06 False None \n",
383 |        "3 06 False None \n",
384 |        "4 06 False None \n",
385 |        "... ... ... ... \n",
386 |        "647283 06b False None \n",
387 |        "647284 06b, 3C, 3C.7 False None \n",
388 |        "647285 06c True 0001750134-22-000001 \n",
389 |        "647286 06b, 3C, 3C.7 False None \n",
390 |        "647287 06b, 3C, 3C.9 False None \n",
391 |        "\n",
392 |        " SALE_DATE ... TOTALNUMBERALREADYINVESTED SALESCOMM_DOLLARAMOUNT \\\n",
393 |        "0 None ... 0 0 \n",
394 |        "1 2008-05-15 ... 6 0 \n",
395 |        "2 2007-09-21 ... 5 0 \n",
396 |        "3 2008-09-30 ... 2 0 \n",
397 |        "4 2004-01-26 ... 41 0 \n",
398 |        "... ... ... ... ... \n",
399 |        "647283 2023-12-22 ... 1 0 \n",
400 |        "647284 None ... 0 0 \n",
401 |        "647285 2018-08-03 ... 578 0 \n",
402 |        "647286 None ... 0 0 \n",
403 |        "647287 None ... 0 0 \n",
404 |        "\n",
405 |        " SALESCOMM_ISESTIMATE FINDERSFEE_DOLLARAMOUNT FINDERSFEE_ISESTIMATE \\\n",
406 |        "0 None 0 None \n",
407 |        "1 None 0 None \n",
408 |        "2 None 0 None \n",
409 |        "3 None 0 None \n",
410 |        "4 None 0 None \n",
411 |        "... ... ... ... \n",
412 |        "647283 None 0 None \n",
413 |        "647284 None 0 None \n",
414 |        "647285 None 0 None \n",
415 |        "647286 None 0 None \n",
416 |        "647287 None 0 None \n",
417 |        "\n",
418 |        " FINDERFEECLARIFICATIONOFRESP \\\n",
419 |        "0 None \n",
420 |        "1 Financial advisers of individual investors may... \n",
421 |        "2 None \n",
422 |        "3 None \n",
423 |        "4 None \n",
424 |        "... ... \n",
425 |        "647283 None \n",
426 |        "647284 None \n",
427 |        "647285 None \n",
428 |        "647286 None \n",
429 |        "647287 None \n",
430 |        "\n",
431 |        " GROSSPROCEEDSUSED_DOLLARAMOUNT GROSSPROCEEDSUSED_ISESTIMATE \\\n",
432 |        "0 0 None \n",
433 |        "1 0 None \n",
434 |        "2 0 None \n",
435 |        "3 0 None \n",
436 |        "4 0 None \n",
437 |        "... ... ... \n",
438 |        "647283 0 None \n",
439 |        "647284 0 None \n",
440 |        "647285 0 None \n",
441 |        "647286 0 None \n",
442 |        "647287 0 None \n",
443 |        "\n",
444 |        " GROSSPROCEEDSUSED_CLAROFRESP \\\n",
445 |        "0 None \n",
446 |        "1 None \n",
447 |        "2 None \n",
448 |        "3 None \n",
449 |        "4 None \n",
450 |        "... ... \n",
451 |        "647283 None \n",
452 |        "647284 None \n",
453 |        "647285 None \n",
454 |        "647286 None \n",
455 |        "647287 The Investment Manager receives a management f... \n",
456 |        "\n",
457 |        " AUTHORIZEDREPRESENTATIVE \n",
458 |        "0 False \n",
459 |        "1 False \n",
460 |        "2 True \n",
461 |        "3 True \n",
462 |        "4 True \n",
463 |        "... ... \n",
464 |        "647283 False \n",
465 |        "647284 False \n",
466 |        "647285 False \n",
467 |        "647286 False \n",
468 |        "647287 False \n",
469 |        "\n",
470 |        "[647288 rows x 41 columns]"
471 |       ]
472 |      },
473 |      "execution_count": 3,
474 |      "metadata": {},
475 |      "output_type": "execute_result"
476 |     }
477 |    ],
478 |    "source": [
479 |     "#Load the offerings TSVs from every folder in the Form D directory\n",
480 |     "offerings = db.read_csv(f\"{path}/*/OFFERING.tsv\", dtype={'TOTALOFFERINGAMOUNT': 'VARCHAR','TOTALREMAINING': 'VARCHAR'}).df()\n",
481 |     "offerings"
482 |    ]
483 |   },
484 |   {
485 |    "cell_type": "code",
486 |    "execution_count": 4,
487 |    "id": "e95e5bc1",
488 |    "metadata": {},
489 |    "outputs": [
490 |     {
491 |      "data": {
492 |       "text/html": [
493 |        "
| | ACCESSIONNUMBER | IS_PRIMARYISSUER_FLAG | ISSUER_SEQ_KEY | CIK | ENTITYNAME | STREET1 | STREET2 | CITY | STATEORCOUNTRY | STATEORCOUNTRYDESCRIPTION | ... | ISSUER_PREVIOUSNAME_1 | ISSUER_PREVIOUSNAME_2 | ISSUER_PREVIOUSNAME_3 | EDGAR_PREVIOUSNAME_1 | EDGAR_PREVIOUSNAME_2 | EDGAR_PREVIOUSNAME_3 | ENTITYTYPE | ENTITYTYPEOTHERDESC | YEAROFINC_TIMESPAN_CHOICE | YEAROFINC_VALUE_ENTERED |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0001477399-09-000001 | YES | 101 | 0001477399 | CSA Biotechnology Fund II, LLC | Two Metroplex Drive | Suite 111 | BIRMINGHAM | AL | None | ... | None | None | None | None | None | None | Limited Liability Company | None | withinFiveYears | 2007.0 |
| 1 | 0001192482-09-000126 | YES | 101 | 0001461727 | Oppenheimer Global Resource Private Equity Off... | C/O OPPENHEIMER ASSET MANAGEMENT LLC | 125 BROAD STREET | NEW YORK | NY | None | ... | None | None | None | None | None | None | Limited Partnership | None | withinFiveYears | 2007.0 |
| 2 | 0001446406-08-000001 | YES | 101 | 0001446406 | Dolce Group Huntsville LLC | 365 THE BRIDGE STREET NW | SUITE 125 | HUNTSVILLE | AL | None | ... | None | None | None | None | None | None | Limited Liability Company | None | withinFiveYears | 2006.0 |
| 3 | 0001446409-08-000001 | YES | 101 | 0001446409 | Dolce Group Hollywood Vine LLC | 1708 VINE STREET | None | HOLLYWOOD | CA | None | ... | None | None | None | None | None | None | Limited Liability Company | None | withinFiveYears | 2008.0 |
| 4 | 0001446405-08-000001 | YES | 101 | 0001446405 | Geisha House LLC | 6633 HOLLYWOOD BOULEVARD | None | HOLLYWOOD | CA | None | ... | None | None | None | None | None | None | Limited Liability Company | None | withinFiveYears | 2004.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 657232 | 0002014428-24-000001 | YES | 101 | 0002014428 | Lower Tier WA13 Holdings, LP | 750 PARK OF COMMERCE DRIVE | SUITE 210 | BOCA RATON | FL | FLORIDA | ... | None | None | None | None | None | None | Limited Partnership | None | withinFiveYears | 2024.0 |
| 657233 | 0001659341-24-000006 | YES | 101 | 0001869503 | NEPC Global Equity Series | c/o Global Trust Company | 12 Gill Street, Suite 2600 | Woburn | MA | MASSACHUSETTS | ... | None | None | None | None | None | None | Other | The issuer is a protected series of NEPC Inves... | withinFiveYears | 2021.0 |
| 657234 | 0002017325-24-000001 | YES | 101 | 0002017325 | Anthems Media, Inc. | 2232 NORTH POINT STREET #3 | None | SAN FRANCISCO | CA | CALIFORNIA | ... | None | None | None | None | None | None | Corporation | None | overFiveYears | NaN |
| 657235 | 0001659341-24-000004 | YES | 101 | 0001649416 | Mercer Diocese of Brooklyn Risk Reduction Stra... | c/o Global Trust Company | 12 Gill Street, Suite 2600 | WOBURN | MA | MASSACHUSETTS | ... | None | None | None | None | None | None | Other | Mercer Diocese of Brooklyn Risk Reduction Stra... | overFiveYears | NaN |
| 657236 | 0001915864-24-000001 | YES | 101 | 0001915864 | SYMPHONY42 Corp | 11400 BARLEY FIELDS WAY | None | MARRIOTTSVILLE | MD | MARYLAND | ... | None | None | None | None | None | None | Corporation | None | withinFiveYears | 2021.0 |

657237 rows × 23 columns

\n", 802 | "
"
803 |       ],
804 |       "text/plain": [
805 |        " ACCESSIONNUMBER IS_PRIMARYISSUER_FLAG ISSUER_SEQ_KEY \\\n",
806 |        "0 0001477399-09-000001 YES 101 \n",
807 |        "1 0001192482-09-000126 YES 101 \n",
808 |        "2 0001446406-08-000001 YES 101 \n",
809 |        "3 0001446409-08-000001 YES 101 \n",
810 |        "4 0001446405-08-000001 YES 101 \n",
811 |        "... ... ... ... \n",
812 |        "657232 0002014428-24-000001 YES 101 \n",
813 |        "657233 0001659341-24-000006 YES 101 \n",
814 |        "657234 0002017325-24-000001 YES 101 \n",
815 |        "657235 0001659341-24-000004 YES 101 \n",
816 |        "657236 0001915864-24-000001 YES 101 \n",
817 |        "\n",
818 |        " CIK ENTITYNAME \\\n",
819 |        "0 0001477399 CSA Biotechnology Fund II, LLC \n",
820 |        "1 0001461727 Oppenheimer Global Resource Private Equity Off... \n",
821 |        "2 0001446406 Dolce Group Huntsville LLC \n",
822 |        "3 0001446409 Dolce Group Hollywood Vine LLC \n",
823 |        "4 0001446405 Geisha House LLC \n",
824 |        "... ... ... \n",
825 |        "657232 0002014428 Lower Tier WA13 Holdings, LP \n",
826 |        "657233 0001869503 NEPC Global Equity Series \n",
827 |        "657234 0002017325 Anthems Media, Inc. \n",
828 |        "657235 0001649416 Mercer Diocese of Brooklyn Risk Reduction Stra... \n",
829 |        "657236 0001915864 SYMPHONY42 Corp \n",
830 |        "\n",
831 |        " STREET1 STREET2 \\\n",
832 |        "0 Two Metroplex Drive Suite 111 \n",
833 |        "1 C/O OPPENHEIMER ASSET MANAGEMENT LLC 125 BROAD STREET \n",
834 |        "2 365 THE BRIDGE STREET NW SUITE 125 \n",
835 |        "3 1708 VINE STREET None \n",
836 |        "4 6633 HOLLYWOOD BOULEVARD None \n",
837 |        "... ... ... \n",
838 |        "657232 750 PARK OF COMMERCE DRIVE SUITE 210 \n",
839 |        "657233 c/o Global Trust Company 12 Gill Street, Suite 2600 \n",
840 |        "657234 2232 NORTH POINT STREET #3 None \n",
841 |        "657235 c/o Global Trust Company 12 Gill Street, Suite 2600 \n",
842 |        "657236 11400 BARLEY FIELDS WAY None \n",
843 |        "\n",
844 |        " CITY STATEORCOUNTRY STATEORCOUNTRYDESCRIPTION ... \\\n",
845 |        "0 BIRMINGHAM AL None ... \n",
846 |        "1 NEW YORK NY None ... \n",
847 |        "2 HUNTSVILLE AL None ... \n",
848 |        "3 HOLLYWOOD CA None ... \n",
849 |        "4 HOLLYWOOD CA None ... \n",
850 |        "... ... ... ... ... \n",
851 |        "657232 BOCA RATON FL FLORIDA ... \n",
852 |        "657233 Woburn MA MASSACHUSETTS ... \n",
853 |        "657234 SAN FRANCISCO CA CALIFORNIA ... \n",
854 |        "657235 WOBURN MA MASSACHUSETTS ... \n",
855 |        "657236 MARRIOTTSVILLE MD MARYLAND ... \n",
856 |        "\n",
857 |        " ISSUER_PREVIOUSNAME_1 ISSUER_PREVIOUSNAME_2 ISSUER_PREVIOUSNAME_3 \\\n",
858 |        "0 None None None \n",
859 |        "1 None None None \n",
860 |        "2 None None None \n",
861 |        "3 None None None \n",
862 |        "4 None None None \n",
863 |        "... ... ... ... \n",
864 |        "657232 None None None \n",
865 |        "657233 None None None \n",
866 |        "657234 None None None \n",
867 |        "657235 None None None \n",
868 |        "657236 None None None \n",
869 |        "\n",
870 |        " EDGAR_PREVIOUSNAME_1 EDGAR_PREVIOUSNAME_2 EDGAR_PREVIOUSNAME_3 \\\n",
871 |        "0 None None None \n",
872 |        "1 None None None \n",
873 |        "2 None None None \n",
874 |        "3 None None None \n",
875 |        "4 None None None \n",
876 |        "... ... ... ... \n",
877 |        "657232 None None None \n",
878 |        "657233 None None None \n",
879 |        "657234 None None None \n",
880 |        "657235 None None None \n",
881 |        "657236 None None None \n",
882 |        "\n",
883 |        " ENTITYTYPE \\\n",
884 |        "0 Limited Liability Company \n",
885 |        "1 Limited Partnership \n",
886 |        "2 Limited Liability Company \n",
887 |        "3 Limited Liability Company \n",
888 |        "4 Limited Liability Company \n",
889 |        "... ... \n",
890 |        "657232 Limited Partnership \n",
891 |        "657233 Other \n",
892 |        "657234 Corporation \n",
893 |        "657235 Other \n",
894 |        "657236 Corporation \n",
895 |        "\n",
896 |        " ENTITYTYPEOTHERDESC \\\n",
897 |        "0 None \n",
898 |        "1 None \n",
899 |        "2 None \n",
900 |        "3 None \n",
901 |        "4 None \n",
902 |        "... ... \n",
903 |        "657232 None \n",
904 |        "657233 The issuer is a protected series of NEPC Inves... \n",
905 |        "657234 None \n",
906 |        "657235 Mercer Diocese of Brooklyn Risk Reduction Stra... \n",
907 |        "657236 None \n",
908 |        "\n",
909 |        " YEAROFINC_TIMESPAN_CHOICE YEAROFINC_VALUE_ENTERED \n",
910 |        "0 withinFiveYears 2007.0 \n",
911 |        "1 withinFiveYears 2007.0 \n",
912 |        "2 withinFiveYears 2006.0 \n",
913 |        "3 withinFiveYears 2008.0 \n",
914 |        "4 withinFiveYears 2004.0 \n",
915 |        "... ... ... \n",
916 |        "657232 withinFiveYears 2024.0 \n",
917 |        "657233 withinFiveYears 2021.0 \n",
918 |        "657234 overFiveYears NaN \n",
919 |        "657235 overFiveYears NaN \n",
920 |        "657236 withinFiveYears 2021.0 \n",
921 |        "\n",
922 |        "[657237 rows x 23 columns]"
923 |       ]
924 |      },
925 |      "execution_count": 4,
926 |      "metadata": {},
927 |      "output_type": "execute_result"
928 |     }
929 |    ],
930 |    "source": [
931 |     "#Load the issuers TSVs from every folder in the Form D directory\n",
932 |     "issuers = db.read_csv(f\"{path}/*/ISSUERS.tsv\", dtype={'ISSUERPHONENUMBER':'VARCHAR','ZIPCODE':'VARCHAR'}).df()\n",
933 |     "issuers"
934 |    ]
935 |   },
936 |   {
937 |    "cell_type": "code",
938 |    "execution_count": 5,
939 |    "id": "26f87842",
940 |    "metadata": {},
941 |    "outputs": [
942 |     {
943 |      "data": {
944 |       "text/html": [
945 |        "
\n", 946 | "\n", 959 | "\n", 960 | " \n", 961 | " \n", 962 | " \n", 963 | " \n", 964 | " \n", 965 | " \n", 966 | " \n", 967 | " \n", 968 | " \n", 969 | " \n", 970 | " \n", 971 | " \n", 972 | " \n", 973 | " \n", 974 | " \n", 975 | " \n", 976 | " \n", 977 | " \n", 978 | " \n", 979 | " \n", 980 | " \n", 981 | " \n", 982 | " \n", 983 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | " \n", 1002 | " \n", 1003 | " \n", 1004 | " \n", 1005 | " \n", 1006 | " \n", 1007 | " \n", 1008 | " \n", 1009 | " \n", 1010 | " \n", 1011 | " \n", 1012 | " \n", 1013 | " \n", 1014 | " \n", 1015 | " \n", 1016 | " \n", 1017 | " \n", 1018 | " \n", 1019 | " \n", 1020 | " \n", 1021 | " \n", 1022 | " \n", 1023 | " \n", 1024 | " \n", 1025 | " \n", 1026 | " \n", 1027 | " \n", 1028 | " \n", 1029 | " \n", 1030 | " \n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | " \n", 1038 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | " \n", 1082 | " \n", 1083 | " \n", 1084 | " \n", 1085 | " \n", 1086 | " \n", 1087 | " \n", 1088 | " \n", 1089 | " \n", 1090 | " \n", 1091 | " \n", 1092 | " \n", 1093 | " \n", 1094 | " \n", 1095 | " \n", 1096 | " \n", 1097 | " \n", 1098 | " \n", 1099 | " \n", 1100 | " \n", 1101 | " \n", 1102 | " \n", 1103 | " 
\n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | "
| | ACCESSIONNUMBER | FILE_NUM | FILING_DATE | SIC_CODE | SCHEMAVERSION | SUBMISSIONTYPE | TESTORLIVE | OVER100PERSONSFLAG | OVER100ISSUERFLAG |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0001477399-09-000001 | 021-136347 | 2008-01-02 06:01:00 | None | X0603 | D | LIVE | None | None |
| 1 | 0001192482-09-000126 | 021-131415 | 2008-05-30 06:01:00 | None | X0301 | D | LIVE | None | None |
| 2 | 0001446406-08-000001 | 021-123021 | 2008-09-30 17:24:23 | None | X0101 | D | LIVE | None | None |
| 3 | 0001446409-08-000001 | 021-123020 | 2008-09-30 16:51:44 | None | X0101 | D | LIVE | None | None |
| 4 | 0001446405-08-000001 | 021-123019 | 2008-09-30 16:33:23 | None | X0101 | D | LIVE | None | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 647283 | 0002005968-23-000002 | 021-501236 | 02-JAN-2024 | None | X0708 | D | LIVE | None | None |
| 647284 | 0001062993-23-023297 | 021-501235 | 02-JAN-2024 | None | X0708 | D | LIVE | None | None |
| 647285 | 0002003898-23-000003 | 021-501229 | 02-JAN-2024 | None | X0708 | D | LIVE | None | None |
| 647286 | 0002003898-23-000001 | 021-501223 | 02-JAN-2024 | None | X0708 | D | LIVE | None | None |
| 647287 | 0002006335-23-000001 | 021-501224 | 02-JAN-2024 | None | X0708 | D | LIVE | None | None |
\n", 1109 | "

647288 rows × 9 columns

\n", 1110 | "
" 1111 | ], 1112 | "text/plain": [ 1113 | " ACCESSIONNUMBER FILE_NUM FILING_DATE \\\n", 1114 | "0 0001477399-09-000001 021-136347 2008-01-02 06:01:00 \n", 1115 | "1 0001192482-09-000126 021-131415 2008-05-30 06:01:00 \n", 1116 | "2 0001446406-08-000001 021-123021 2008-09-30 17:24:23 \n", 1117 | "3 0001446409-08-000001 021-123020 2008-09-30 16:51:44 \n", 1118 | "4 0001446405-08-000001 021-123019 2008-09-30 16:33:23 \n", 1119 | "... ... ... ... \n", 1120 | "647283 0002005968-23-000002 021-501236 02-JAN-2024 \n", 1121 | "647284 0001062993-23-023297 021-501235 02-JAN-2024 \n", 1122 | "647285 0002003898-23-000003 021-501229 02-JAN-2024 \n", 1123 | "647286 0002003898-23-000001 021-501223 02-JAN-2024 \n", 1124 | "647287 0002006335-23-000001 021-501224 02-JAN-2024 \n", 1125 | "\n", 1126 | " SIC_CODE SCHEMAVERSION SUBMISSIONTYPE TESTORLIVE OVER100PERSONSFLAG \\\n", 1127 | "0 None X0603 D LIVE None \n", 1128 | "1 None X0301 D LIVE None \n", 1129 | "2 None X0101 D LIVE None \n", 1130 | "3 None X0101 D LIVE None \n", 1131 | "4 None X0101 D LIVE None \n", 1132 | "... ... ... ... ... ... \n", 1133 | "647283 None X0708 D LIVE None \n", 1134 | "647284 None X0708 D LIVE None \n", 1135 | "647285 None X0708 D LIVE None \n", 1136 | "647286 None X0708 D LIVE None \n", 1137 | "647287 None X0708 D LIVE None \n", 1138 | "\n", 1139 | " OVER100ISSUERFLAG \n", 1140 | "0 None \n", 1141 | "1 None \n", 1142 | "2 None \n", 1143 | "3 None \n", 1144 | "4 None \n", 1145 | "... ... 
\n", 1146 | "647283 None \n", 1147 | "647284 None \n", 1148 | "647285 None \n", 1149 | "647286 None \n", 1150 | "647287 None \n", 1151 | "\n", 1152 | "[647288 rows x 9 columns]" 1153 | ] 1154 | }, 1155 | "execution_count": 5, 1156 | "metadata": {}, 1157 | "output_type": "execute_result" 1158 | } 1159 | ], 1160 | "source": [ 1161 | "#Load the form_d_submissions TSVs from every folder in the Form D directory\n", 1162 | "form_d_submission = db.read_csv(f\"{path}/*/FORMDSUBMISSION.tsv\", dtype={'FILING_DATE':'VARCHAR'}).df()\n", 1163 | "form_d_submission" 1164 | ] 1165 | }, 1166 | { 1167 | "cell_type": "code", 1168 | "execution_count": 6, 1169 | "id": "d86d70ed", 1170 | "metadata": {}, 1171 | "outputs": [ 1172 | { 1173 | "data": { 1174 | "text/html": [ 1175 | "
\n", 1176 | "\n", 1189 | "\n", 1190 | " \n", 1191 | " \n", 1192 | " \n", 1193 | " \n", 1194 | " \n", 1195 | " \n", 1196 | " \n", 1197 | " \n", 1198 | " \n", 1199 | " \n", 1200 | " \n", 1201 | " \n", 1202 | " \n", 1203 | " \n", 1204 | " \n", 1205 | " \n", 1206 | " \n", 1207 | " \n", 1208 | " \n", 1209 | " \n", 1210 | " \n", 1211 | " \n", 1212 | " \n", 1213 | " \n", 1214 | " \n", 1215 | " \n", 1216 | " \n", 1217 | " \n", 1218 | " \n", 1219 | " \n", 1220 | " \n", 1221 | " \n", 1222 | " \n", 1223 | " \n", 1224 | " \n", 1225 | " \n", 1226 | " \n", 1227 | " \n", 1228 | " \n", 1229 | " \n", 1230 | " \n", 1231 | " \n", 1232 | " \n", 1233 | " \n", 1234 | " \n", 1235 | " \n", 1236 | " \n", 1237 | " \n", 1238 | " \n", 1239 | " \n", 1240 | " \n", 1241 | " \n", 1242 | " \n", 1243 | " \n", 1244 | " \n", 1245 | " \n", 1246 | " \n", 1247 | " \n", 1248 | " \n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1252 | " \n", 1253 | " \n", 1254 | " \n", 1255 | " \n", 1256 | " \n", 1257 | " \n", 1258 | " \n", 1259 | " \n", 1260 | " \n", 1261 | " \n", 1262 | " \n", 1263 | " \n", 1264 | " \n", 1265 | " \n", 1266 | " \n", 1267 | " \n", 1268 | " \n", 1269 | " \n", 1270 | " \n", 1271 | " \n", 1272 | " \n", 1273 | " \n", 1274 | " \n", 1275 | " \n", 1276 | " \n", 1277 | " \n", 1278 | " \n", 1279 | " \n", 1280 | " \n", 1281 | " \n", 1282 | " \n", 1283 | " \n", 1284 | " \n", 1285 | " \n", 1286 | " \n", 1287 | " \n", 1288 | " \n", 1289 | " \n", 1290 | " \n", 1291 | " \n", 1292 | " \n", 1293 | " \n", 1294 | " \n", 1295 | " \n", 1296 | " \n", 1297 | " \n", 1298 | " \n", 1299 | " \n", 1300 | " \n", 1301 | " \n", 1302 | " \n", 1303 | " \n", 1304 | " \n", 1305 | " \n", 1306 | " \n", 1307 | " \n", 1308 | " \n", 1309 | " \n", 1310 | " \n", 1311 | " \n", 1312 | " \n", 1313 | " \n", 1314 | " \n", 1315 | " \n", 1316 | " \n", 1317 | " \n", 1318 | " \n", 1319 | " \n", 1320 | " \n", 1321 | " \n", 1322 | " \n", 1323 | " \n", 1324 | " \n", 1325 | " \n", 1326 | " \n", 1327 | " \n", 1328 | " \n", 1329 | " \n", 1330 | " 
\n", 1331 | " \n", 1332 | " \n", 1333 | " \n", 1334 | " \n", 1335 | " \n", 1336 | " \n", 1337 | " \n", 1338 | " \n", 1339 | " \n", 1340 | " \n", 1341 | " \n", 1342 | " \n", 1343 | " \n", 1344 | " \n", 1345 | " \n", 1346 | " \n", 1347 | " \n", 1348 | " \n", 1349 | " \n", 1350 | " \n", 1351 | " \n", 1352 | " \n", 1353 | " \n", 1354 | " \n", 1355 | " \n", 1356 | " \n", 1357 | " \n", 1358 | " \n", 1359 | " \n", 1360 | " \n", 1361 | " \n", 1362 | " \n", 1363 | " \n", 1364 | " \n", 1365 | " \n", 1366 | " \n", 1367 | " \n", 1368 | " \n", 1369 | " \n", 1370 | " \n", 1371 | " \n", 1372 | " \n", 1373 | " \n", 1374 | " \n", 1375 | " \n", 1376 | " \n", 1377 | " \n", 1378 | " \n", 1379 | " \n", 1380 | " \n", 1381 | " \n", 1382 | " \n", 1383 | " \n", 1384 | " \n", 1385 | " \n", 1386 | " \n", 1387 | " \n", 1388 | " \n", 1389 | " \n", 1390 | " \n", 1391 | " \n", 1392 | " \n", 1393 | " \n", 1394 | " \n", 1395 | " \n", 1396 | " \n", 1397 | " \n", 1398 | " \n", 1399 | " \n", 1400 | " \n", 1401 | " \n", 1402 | " \n", 1403 | " \n", 1404 | " \n", 1405 | " \n", 1406 | " \n", 1407 | " \n", 1408 | " \n", 1409 | " \n", 1410 | " \n", 1411 | " \n", 1412 | " \n", 1413 | " \n", 1414 | " \n", 1415 | " \n", 1416 | " \n", 1417 | " \n", 1418 | " \n", 1419 | " \n", 1420 | " \n", 1421 | " \n", 1422 | " \n", 1423 | " \n", 1424 | " \n", 1425 | " \n", 1426 | " \n", 1427 | " \n", 1428 | " \n", 1429 | " \n", 1430 | " \n", 1431 | " \n", 1432 | " \n", 1433 | " \n", 1434 | " \n", 1435 | " \n", 1436 | " \n", 1437 | " \n", 1438 | " \n", 1439 | " \n", 1440 | " \n", 1441 | " \n", 1442 | " \n", 1443 | " \n", 1444 | " \n", 1445 | " \n", 1446 | " \n", 1447 | " \n", 1448 | " \n", 1449 | " \n", 1450 | " \n", 1451 | " \n", 1452 | " \n", 1453 | " \n", 1454 | " \n", 1455 | " \n", 1456 | " \n", 1457 | " \n", 1458 | " \n", 1459 | " \n", 1460 | " \n", 1461 | " \n", 1462 | " \n", 1463 | " \n", 1464 | " \n", 1465 | " \n", 1466 | " \n", 1467 | " \n", 1468 | " \n", 1469 | " \n", 1470 | " \n", 1471 | " \n", 1472 | " \n", 1473 | 
" \n", 1474 | " \n", 1475 | " \n", 1476 | " \n", 1477 | " \n", 1478 | " \n", 1479 | " \n", 1480 | " \n", 1481 | " \n", 1482 | "
| | ACCESSIONNUMBER | INDUSTRYGROUPTYPE | INVESTMENTFUNDTYPE | IS40ACT | REVENUERANGE | AGGREGATENETASSETVALUERANGE | FEDERALEXEMPTIONS_ITEMS_LIST | ISAMENDMENT | PREVIOUSACCESSIONNUMBER | SALE_DATE | ... | YEAROFINC_VALUE_ENTERED | ACCESSIONNUMBER_2 | FILE_NUM | FILING_DATE | SIC_CODE | SCHEMAVERSION | SUBMISSIONTYPE | TESTORLIVE | OVER100PERSONSFLAG | OVER100ISSUERFLAG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0001984536-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 2023.0 | 0001984536-23-000001 | 021-486764 | 13-JUL-2023 | None | X0708 | D | LIVE | None | None |
| 1 | 0001984537-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 2023.0 | 0001984537-23-000001 | 021-486762 | 13-JUL-2023 | None | X0708 | D | LIVE | None | None |
| 2 | 0001984773-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 2023.0 | 0001984773-23-000001 | 021-486755 | 13-JUL-2023 | None | X0708 | D | LIVE | None | None |
| 3 | 0001984539-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 2023.0 | 0001984539-23-000001 | 021-486754 | 13-JUL-2023 | None | X0708 | D | LIVE | None | None |
| 4 | 0001983466-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1, 3C.7 | False | None | None | ... | 2023.0 | 0001983466-23-000001 | 021-486751 | 13-JUL-2023 | None | X0708 | D | LIVE | None | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 47931 | 0001994032-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-09-19 | ... | 2023.0 | 0001994032-23-000002 | 021-492526 | 20-SEP-2023 | None | X0708 | D | LIVE | None | None |
| 47932 | 0001972237-23-000003 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06c | True | 0001972237-23-000002 | 2023-04-07 | ... | 2023.0 | 0001972237-23-000003 | 021-479110 | 17-AUG-2023 | None | X0708 | D/A | LIVE | None | None |
| 47933 | 0001986286-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06c, 3C, 3C.1 | False | None | 2023-08-09 | ... | 2023.0 | 0001986286-23-000002 | 021-489519 | 11-AUG-2023 | None | X0708 | D | LIVE | None | None |
| 47934 | 0001988652-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-08-01 | ... | 2023.0 | 0001988652-23-000002 | 021-488753 | 04-AUG-2023 | None | X0708 | D | LIVE | None | None |
| 47935 | 0001988021-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | None | 06b, 3C, 3C.1 | False | None | 2023-07-25 | ... | 2023.0 | 0001988021-23-000001 | 021-488454 | 01-AUG-2023 | None | X0708 | D | LIVE | None | None |
\n", 1483 | "

47936 rows × 73 columns

\n", 1484 | "
" 1485 | ], 1486 | "text/plain": [ 1487 | " ACCESSIONNUMBER INDUSTRYGROUPTYPE INVESTMENTFUNDTYPE \\\n", 1488 | "0 0001984536-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1489 | "1 0001984537-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1490 | "2 0001984773-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1491 | "3 0001984539-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1492 | "4 0001983466-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1493 | "... ... ... ... \n", 1494 | "47931 0001994032-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1495 | "47932 0001972237-23-000003 Pooled Investment Fund Venture Capital Fund \n", 1496 | "47933 0001986286-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1497 | "47934 0001988652-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1498 | "47935 0001988021-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1499 | "\n", 1500 | " IS40ACT REVENUERANGE AGGREGATENETASSETVALUERANGE \\\n", 1501 | "0 false Decline to Disclose None \n", 1502 | "1 false Decline to Disclose None \n", 1503 | "2 false Decline to Disclose None \n", 1504 | "3 false Decline to Disclose None \n", 1505 | "4 false Decline to Disclose None \n", 1506 | "... ... ... ... \n", 1507 | "47931 false Decline to Disclose None \n", 1508 | "47932 false Decline to Disclose None \n", 1509 | "47933 false Decline to Disclose None \n", 1510 | "47934 false Decline to Disclose None \n", 1511 | "47935 false Decline to Disclose None \n", 1512 | "\n", 1513 | " FEDERALEXEMPTIONS_ITEMS_LIST ISAMENDMENT PREVIOUSACCESSIONNUMBER \\\n", 1514 | "0 06b, 3C, 3C.1 False None \n", 1515 | "1 06b, 3C, 3C.1 False None \n", 1516 | "2 06b, 3C, 3C.1 False None \n", 1517 | "3 06b, 3C, 3C.1 False None \n", 1518 | "4 06b, 3C, 3C.1, 3C.7 False None \n", 1519 | "... ... ... ... 
\n", 1520 | "47931 06b, 3C, 3C.1 False None \n", 1521 | "47932 06c True 0001972237-23-000002 \n", 1522 | "47933 06c, 3C, 3C.1 False None \n", 1523 | "47934 06b, 3C, 3C.1 False None \n", 1524 | "47935 06b, 3C, 3C.1 False None \n", 1525 | "\n", 1526 | " SALE_DATE ... YEAROFINC_VALUE_ENTERED ACCESSIONNUMBER_2 \\\n", 1527 | "0 2023-06-30 ... 2023.0 0001984536-23-000001 \n", 1528 | "1 2023-06-30 ... 2023.0 0001984537-23-000001 \n", 1529 | "2 2023-06-30 ... 2023.0 0001984773-23-000001 \n", 1530 | "3 2023-06-30 ... 2023.0 0001984539-23-000001 \n", 1531 | "4 None ... 2023.0 0001983466-23-000001 \n", 1532 | "... ... ... ... ... \n", 1533 | "47931 2023-09-19 ... 2023.0 0001994032-23-000002 \n", 1534 | "47932 2023-04-07 ... 2023.0 0001972237-23-000003 \n", 1535 | "47933 2023-08-09 ... 2023.0 0001986286-23-000002 \n", 1536 | "47934 2023-08-01 ... 2023.0 0001988652-23-000002 \n", 1537 | "47935 2023-07-25 ... 2023.0 0001988021-23-000001 \n", 1538 | "\n", 1539 | " FILE_NUM FILING_DATE SIC_CODE SCHEMAVERSION \\\n", 1540 | "0 021-486764 13-JUL-2023 None X0708 \n", 1541 | "1 021-486762 13-JUL-2023 None X0708 \n", 1542 | "2 021-486755 13-JUL-2023 None X0708 \n", 1543 | "3 021-486754 13-JUL-2023 None X0708 \n", 1544 | "4 021-486751 13-JUL-2023 None X0708 \n", 1545 | "... ... ... ... ... \n", 1546 | "47931 021-492526 20-SEP-2023 None X0708 \n", 1547 | "47932 021-479110 17-AUG-2023 None X0708 \n", 1548 | "47933 021-489519 11-AUG-2023 None X0708 \n", 1549 | "47934 021-488753 04-AUG-2023 None X0708 \n", 1550 | "47935 021-488454 01-AUG-2023 None X0708 \n", 1551 | "\n", 1552 | " SUBMISSIONTYPE TESTORLIVE OVER100PERSONSFLAG OVER100ISSUERFLAG \n", 1553 | "0 D LIVE None None \n", 1554 | "1 D LIVE None None \n", 1555 | "2 D LIVE None None \n", 1556 | "3 D LIVE None None \n", 1557 | "4 D LIVE None None \n", 1558 | "... ... ... ... ... 
\n", 1559 | "47931 D LIVE None None \n", 1560 | "47932 D/A LIVE None None \n", 1561 | "47933 D LIVE None None \n", 1562 | "47934 D LIVE None None \n", 1563 | "47935 D LIVE None None \n", 1564 | "\n", 1565 | "[47936 rows x 73 columns]" 1566 | ] 1567 | }, 1568 | "execution_count": 6, 1569 | "metadata": {}, 1570 | "output_type": "execute_result" 1571 | } 1572 | ], 1573 | "source": [ 1574 | "#This query joins the three data frames together \n", 1575 | "\n", 1576 | "joins = \"\"\"\n", 1577 | "select * from offerings as o \n", 1578 | "left join issuers as i on i.ACCESSIONNUMBER = o.ACCESSIONNUMBER\n", 1579 | "left join form_d_submission as fds on fds.ACCESSIONNUMBER = o.ACCESSIONNUMBER\n", 1580 | "\"\"\"\n", 1581 | "\n", 1582 | "where = \"\"\"\n", 1583 | "where INVESTMENTFUNDTYPE = 'Venture Capital Fund'\n", 1584 | "\"\"\"\n", 1585 | "\n", 1586 | "query = joins + where\n", 1587 | "\n", 1588 | "df = db.query(f'{query}').df()\n", 1589 | "df" 1590 | ] 1591 | }, 1592 | { 1593 | "cell_type": "code", 1594 | "execution_count": 7, 1595 | "id": "676f0a50", 1596 | "metadata": {}, 1597 | "outputs": [], 1598 | "source": [ 1599 | "def normalize_date(date_str):\n", 1600 | " if pd.isna(date_str):\n", 1601 | " return np.nan\n", 1602 | " try:\n", 1603 | " # Try to parse the date string\n", 1604 | " parsed_date = parser.parse(date_str)\n", 1605 | " # Return the date in a standard format\n", 1606 | " return parsed_date.strftime('%Y-%m-%d')\n", 1607 | " except:\n", 1608 | " # If parsing fails, return NaN or the original string\n", 1609 | " return np.nan # or return date_str if you prefer\n", 1610 | "\n", 1611 | "# Apply the function to your DataFrame column\n", 1612 | "df['normalized_date'] = df['FILING_DATE'].apply(normalize_date)" 1613 | ] 1614 | }, 1615 | { 1616 | "cell_type": "code", 1617 | "execution_count": 8, 1618 | "id": "9538fe76", 1619 | "metadata": {}, 1620 | "outputs": [], 1621 | "source": [ 1622 | "#Filter out prior filings for the same fund\n", 1623 | 
"previous_accession_numbers = set(df['PREVIOUSACCESSIONNUMBER'].dropna())\n", 1624 | "vc = df[~df['ACCESSIONNUMBER'].isin(previous_accession_numbers)]" 1625 | ] 1626 | }, 1627 | { 1628 | "cell_type": "code", 1629 | "execution_count": 9, 1630 | "id": "b0188141", 1631 | "metadata": { 1632 | "scrolled": true 1633 | }, 1634 | "outputs": [ 1635 | { 1636 | "data": { 1637 | "text/html": [ 1638 | "
\n", 1639 | "\n", 1652 | "\n", 1653 | " \n", 1654 | " \n", 1655 | " \n", 1656 | " \n", 1657 | " \n", 1658 | " \n", 1659 | " \n", 1660 | " \n", 1661 | " \n", 1662 | " \n", 1663 | " \n", 1664 | " \n", 1665 | " \n", 1666 | " \n", 1667 | " \n", 1668 | " \n", 1669 | " \n", 1670 | " \n", 1671 | " \n", 1672 | " \n", 1673 | " \n", 1674 | " \n", 1675 | " \n", 1676 | " \n", 1677 | " \n", 1678 | " \n", 1679 | " \n", 1680 | " \n", 1681 | " \n", 1682 | " \n", 1683 | " \n", 1684 | " \n", 1685 | " \n", 1686 | " \n", 1687 | " \n", 1688 | " \n", 1689 | " \n", 1690 | " \n", 1691 | " \n", 1692 | " \n", 1693 | " \n", 1694 | " \n", 1695 | " \n", 1696 | " \n", 1697 | " \n", 1698 | " \n", 1699 | " \n", 1700 | " \n", 1701 | " \n", 1702 | " \n", 1703 | " \n", 1704 | " \n", 1705 | " \n", 1706 | " \n", 1707 | " \n", 1708 | " \n", 1709 | " \n", 1710 | " \n", 1711 | " \n", 1712 | " \n", 1713 | " \n", 1714 | " \n", 1715 | " \n", 1716 | " \n", 1717 | " \n", 1718 | " \n", 1719 | " \n", 1720 | " \n", 1721 | " \n", 1722 | " \n", 1723 | " \n", 1724 | " \n", 1725 | " \n", 1726 | " \n", 1727 | " \n", 1728 | " \n", 1729 | " \n", 1730 | " \n", 1731 | " \n", 1732 | " \n", 1733 | " \n", 1734 | " \n", 1735 | " \n", 1736 | " \n", 1737 | " \n", 1738 | " \n", 1739 | " \n", 1740 | " \n", 1741 | " \n", 1742 | " \n", 1743 | " \n", 1744 | " \n", 1745 | " \n", 1746 | " \n", 1747 | " \n", 1748 | " \n", 1749 | " \n", 1750 | " \n", 1751 | " \n", 1752 | " \n", 1753 | " \n", 1754 | " \n", 1755 | " \n", 1756 | " \n", 1757 | " \n", 1758 | " \n", 1759 | " \n", 1760 | " \n", 1761 | " \n", 1762 | " \n", 1763 | " \n", 1764 | " \n", 1765 | " \n", 1766 | " \n", 1767 | " \n", 1768 | " \n", 1769 | " \n", 1770 | " \n", 1771 | " \n", 1772 | " \n", 1773 | " \n", 1774 | " \n", 1775 | " \n", 1776 | " \n", 1777 | " \n", 1778 | " \n", 1779 | " \n", 1780 | " \n", 1781 | " \n", 1782 | " \n", 1783 | " \n", 1784 | " \n", 1785 | " \n", 1786 | " \n", 1787 | " \n", 1788 | " \n", 1789 | " \n", 1790 | " \n", 1791 | " \n", 1792 | " \n", 1793 | " 
\n", 1794 | " \n", 1795 | " \n", 1796 | " \n", 1797 | " \n", 1798 | " \n", 1799 | " \n", 1800 | " \n", 1801 | " \n", 1802 | " \n", 1803 | " \n", 1804 | " \n", 1805 | " \n", 1806 | " \n", 1807 | " \n", 1808 | " \n", 1809 | " \n", 1810 | " \n", 1811 | " \n", 1812 | " \n", 1813 | " \n", 1814 | " \n", 1815 | " \n", 1816 | " \n", 1817 | " \n", 1818 | " \n", 1819 | " \n", 1820 | " \n", 1821 | " \n", 1822 | " \n", 1823 | " \n", 1824 | " \n", 1825 | " \n", 1826 | " \n", 1827 | " \n", 1828 | " \n", 1829 | " \n", 1830 | " \n", 1831 | " \n", 1832 | " \n", 1833 | " \n", 1834 | " \n", 1835 | " \n", 1836 | " \n", 1837 | " \n", 1838 | " \n", 1839 | " \n", 1840 | " \n", 1841 | " \n", 1842 | " \n", 1843 | " \n", 1844 | " \n", 1845 | " \n", 1846 | " \n", 1847 | " \n", 1848 | " \n", 1849 | " \n", 1850 | " \n", 1851 | " \n", 1852 | " \n", 1853 | " \n", 1854 | " \n", 1855 | " \n", 1856 | " \n", 1857 | " \n", 1858 | " \n", 1859 | " \n", 1860 | " \n", 1861 | " \n", 1862 | " \n", 1863 | " \n", 1864 | " \n", 1865 | " \n", 1866 | " \n", 1867 | " \n", 1868 | " \n", 1869 | " \n", 1870 | " \n", 1871 | " \n", 1872 | " \n", 1873 | " \n", 1874 | " \n", 1875 | " \n", 1876 | " \n", 1877 | " \n", 1878 | " \n", 1879 | " \n", 1880 | " \n", 1881 | " \n", 1882 | " \n", 1883 | " \n", 1884 | " \n", 1885 | " \n", 1886 | " \n", 1887 | " \n", 1888 | " \n", 1889 | " \n", 1890 | " \n", 1891 | " \n", 1892 | " \n", 1893 | " \n", 1894 | " \n", 1895 | " \n", 1896 | " \n", 1897 | " \n", 1898 | " \n", 1899 | " \n", 1900 | " \n", 1901 | " \n", 1902 | " \n", 1903 | " \n", 1904 | " \n", 1905 | " \n", 1906 | " \n", 1907 | " \n", 1908 | " \n", 1909 | " \n", 1910 | " \n", 1911 | " \n", 1912 | " \n", 1913 | " \n", 1914 | " \n", 1915 | " \n", 1916 | " \n", 1917 | " \n", 1918 | " \n", 1919 | " \n", 1920 | " \n", 1921 | " \n", 1922 | " \n", 1923 | " \n", 1924 | " \n", 1925 | " \n", 1926 | " \n", 1927 | " \n", 1928 | " \n", 1929 | " \n", 1930 | " \n", 1931 | " \n", 1932 | " \n", 1933 | " \n", 1934 | " \n", 1935 | " \n", 1936 | 
" \n", 1937 | " \n", 1938 | " \n", 1939 | " \n", 1940 | " \n", 1941 | " \n", 1942 | " \n", 1943 | " \n", 1944 | " \n", 1945 | "
| | ACCESSIONNUMBER | INDUSTRYGROUPTYPE | INVESTMENTFUNDTYPE | IS40ACT | REVENUERANGE | AGGREGATENETASSETVALUERANGE | FEDERALEXEMPTIONS_ITEMS_LIST | ISAMENDMENT | PREVIOUSACCESSIONNUMBER | SALE_DATE | ... | ACCESSIONNUMBER_2 | FILE_NUM | FILING_DATE | SIC_CODE | SCHEMAVERSION | SUBMISSIONTYPE | TESTORLIVE | OVER100PERSONSFLAG | OVER100ISSUERFLAG | normalized_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0001984536-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 0001984536-23-000001 | 021-486764 | 13-JUL-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-07-13 |
| 1 | 0001984537-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 0001984537-23-000001 | 021-486762 | 13-JUL-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-07-13 |
| 2 | 0001984773-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 0001984773-23-000001 | 021-486755 | 13-JUL-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-07-13 |
| 3 | 0001984539-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-06-30 | ... | 0001984539-23-000001 | 021-486754 | 13-JUL-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-07-13 |
| 4 | 0001983466-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1, 3C.7 | False | None | None | ... | 0001983466-23-000001 | 021-486751 | 13-JUL-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-07-13 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 40934 | 0001994032-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-09-19 | ... | 0001994032-23-000002 | 021-492526 | 20-SEP-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-09-20 |
| 40935 | 0001972237-23-000003 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06c | True | 0001972237-23-000002 | 2023-04-07 | ... | 0001972237-23-000003 | 021-479110 | 17-AUG-2023 | None | X0708 | D/A | LIVE | NaN | NaN | 2023-08-17 |
| 40936 | 0001986286-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06c, 3C, 3C.1 | False | None | 2023-08-09 | ... | 0001986286-23-000002 | 021-489519 | 11-AUG-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-08-11 |
| 40937 | 0001988652-23-000002 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-08-01 | ... | 0001988652-23-000002 | 021-488753 | 04-AUG-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-08-04 |
| 40938 | 0001988021-23-000001 | Pooled Investment Fund | Venture Capital Fund | false | Decline to Disclose | NaN | 06b, 3C, 3C.1 | False | None | 2023-07-25 | ... | 0001988021-23-000001 | 021-488454 | 01-AUG-2023 | None | X0708 | D | LIVE | NaN | NaN | 2023-08-01 |
\n", 1946 | "

40939 rows × 74 columns

\n", 1947 | "
" 1948 | ], 1949 | "text/plain": [ 1950 | " ACCESSIONNUMBER INDUSTRYGROUPTYPE INVESTMENTFUNDTYPE \\\n", 1951 | "0 0001984536-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1952 | "1 0001984537-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1953 | "2 0001984773-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1954 | "3 0001984539-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1955 | "4 0001983466-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1956 | "... ... ... ... \n", 1957 | "40934 0001994032-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1958 | "40935 0001972237-23-000003 Pooled Investment Fund Venture Capital Fund \n", 1959 | "40936 0001986286-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1960 | "40937 0001988652-23-000002 Pooled Investment Fund Venture Capital Fund \n", 1961 | "40938 0001988021-23-000001 Pooled Investment Fund Venture Capital Fund \n", 1962 | "\n", 1963 | " IS40ACT REVENUERANGE AGGREGATENETASSETVALUERANGE \\\n", 1964 | "0 false Decline to Disclose NaN \n", 1965 | "1 false Decline to Disclose NaN \n", 1966 | "2 false Decline to Disclose NaN \n", 1967 | "3 false Decline to Disclose NaN \n", 1968 | "4 false Decline to Disclose NaN \n", 1969 | "... ... ... ... \n", 1970 | "40934 false Decline to Disclose NaN \n", 1971 | "40935 false Decline to Disclose NaN \n", 1972 | "40936 false Decline to Disclose NaN \n", 1973 | "40937 false Decline to Disclose NaN \n", 1974 | "40938 false Decline to Disclose NaN \n", 1975 | "\n", 1976 | " FEDERALEXEMPTIONS_ITEMS_LIST ISAMENDMENT PREVIOUSACCESSIONNUMBER \\\n", 1977 | "0 06b, 3C, 3C.1 False None \n", 1978 | "1 06b, 3C, 3C.1 False None \n", 1979 | "2 06b, 3C, 3C.1 False None \n", 1980 | "3 06b, 3C, 3C.1 False None \n", 1981 | "4 06b, 3C, 3C.1, 3C.7 False None \n", 1982 | "... ... ... ... 
\n", 1983 | "40934 06b, 3C, 3C.1 False None \n", 1984 | "40935 06c True 0001972237-23-000002 \n", 1985 | "40936 06c, 3C, 3C.1 False None \n", 1986 | "40937 06b, 3C, 3C.1 False None \n", 1987 | "40938 06b, 3C, 3C.1 False None \n", 1988 | "\n", 1989 | " SALE_DATE ... ACCESSIONNUMBER_2 FILE_NUM \\\n", 1990 | "0 2023-06-30 ... 0001984536-23-000001 021-486764 \n", 1991 | "1 2023-06-30 ... 0001984537-23-000001 021-486762 \n", 1992 | "2 2023-06-30 ... 0001984773-23-000001 021-486755 \n", 1993 | "3 2023-06-30 ... 0001984539-23-000001 021-486754 \n", 1994 | "4 None ... 0001983466-23-000001 021-486751 \n", 1995 | "... ... ... ... ... \n", 1996 | "40934 2023-09-19 ... 0001994032-23-000002 021-492526 \n", 1997 | "40935 2023-04-07 ... 0001972237-23-000003 021-479110 \n", 1998 | "40936 2023-08-09 ... 0001986286-23-000002 021-489519 \n", 1999 | "40937 2023-08-01 ... 0001988652-23-000002 021-488753 \n", 2000 | "40938 2023-07-25 ... 0001988021-23-000001 021-488454 \n", 2001 | "\n", 2002 | " FILING_DATE SIC_CODE SCHEMAVERSION SUBMISSIONTYPE TESTORLIVE \\\n", 2003 | "0 13-JUL-2023 None X0708 D LIVE \n", 2004 | "1 13-JUL-2023 None X0708 D LIVE \n", 2005 | "2 13-JUL-2023 None X0708 D LIVE \n", 2006 | "3 13-JUL-2023 None X0708 D LIVE \n", 2007 | "4 13-JUL-2023 None X0708 D LIVE \n", 2008 | "... ... ... ... ... ... \n", 2009 | "40934 20-SEP-2023 None X0708 D LIVE \n", 2010 | "40935 17-AUG-2023 None X0708 D/A LIVE \n", 2011 | "40936 11-AUG-2023 None X0708 D LIVE \n", 2012 | "40937 04-AUG-2023 None X0708 D LIVE \n", 2013 | "40938 01-AUG-2023 None X0708 D LIVE \n", 2014 | "\n", 2015 | " OVER100PERSONSFLAG OVER100ISSUERFLAG normalized_date \n", 2016 | "0 NaN NaN 2023-07-13 \n", 2017 | "1 NaN NaN 2023-07-13 \n", 2018 | "2 NaN NaN 2023-07-13 \n", 2019 | "3 NaN NaN 2023-07-13 \n", 2020 | "4 NaN NaN 2023-07-13 \n", 2021 | "... ... ... ... 
\n", 2022 | "40934 NaN NaN 2023-09-20 \n", 2023 | "40935 NaN NaN 2023-08-17 \n", 2024 | "40936 NaN NaN 2023-08-11 \n", 2025 | "40937 NaN NaN 2023-08-04 \n", 2026 | "40938 NaN NaN 2023-08-01 \n", 2027 | "\n", 2028 | "[40939 rows x 74 columns]" 2029 | ] 2030 | }, 2031 | "execution_count": 9, 2032 | "metadata": {}, 2033 | "output_type": "execute_result" 2034 | } 2035 | ], 2036 | "source": [ 2037 | "# Create a new DuckDB database file\n", 2038 | "vc_database = db.connect('vc_database.duckdb')\n", 2039 | "\n", 2040 | "# Execute the join query and write the results directly to the new database\n", 2041 | "vc_database.execute(f\"\"\"\n", 2042 | "CREATE TABLE vc_data AS\n", 2043 | "select * from vc\n", 2044 | "\"\"\")\n", 2045 | "\n", 2046 | "vc_database.execute(f'select * from vc').df()" 2047 | ] 2048 | }, 2049 | { 2050 | "cell_type": "code", 2051 | "execution_count": 10, 2052 | "id": "07d29bff", 2053 | "metadata": {}, 2054 | "outputs": [], 2055 | "source": [ 2056 | "# Close the connection\n", 2057 | "vc_database.close()" 2058 | ] 2059 | } 2060 | ], 2061 | "metadata": { 2062 | "kernelspec": { 2063 | "display_name": "Python 3 (ipykernel)", 2064 | "language": "python", 2065 | "name": "python3" 2066 | }, 2067 | "language_info": { 2068 | "codemirror_mode": { 2069 | "name": "ipython", 2070 | "version": 3 2071 | }, 2072 | "file_extension": ".py", 2073 | "mimetype": "text/x-python", 2074 | "name": "python", 2075 | "nbconvert_exporter": "python", 2076 | "pygments_lexer": "ipython3", 2077 | "version": "3.11.1" 2078 | } 2079 | }, 2080 | "nbformat": 4, 2081 | "nbformat_minor": 5 2082 | } 2083 | -------------------------------------------------------------------------------- /Jupyter DuckDB SEC From D Example.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # In[1]: 5 | 6 | 7 | #Imports 8 | import pandas as pd 9 | import duckdb as db 10 | import numpy as np 11 | from dateutil import parser 12 | 
13 |
14 | # In[2]:
15 |
16 |
17 | """
18 | Specify the path where the Form D files are stored.
19 | These can be downloaded from https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets.
20 | This notebook expects each folder from the SEC data set to be stored inside the folder Form_D_data.
21 | """
22 | path = "../Form_D_data"
23 |
24 |
25 | # In[3]:
26 |
27 |
28 | # Load the offerings TSVs from every folder in the Form D directory
29 | offerings = db.read_csv(f"{path}/*/OFFERING.tsv", dtype={'TOTALOFFERINGAMOUNT': 'VARCHAR','TOTALREMAINING': 'VARCHAR'}).df()
30 | offerings
31 |
32 |
33 | # In[4]:
34 |
35 |
36 | # Load the issuers TSVs from every folder in the Form D directory
37 | issuers = db.read_csv(f"{path}/*/ISSUERS.tsv", dtype={'ISSUERPHONENUMBER':'VARCHAR','ZIPCODE':'VARCHAR'}).df()
38 | issuers
39 |
40 |
41 | # In[5]:
42 |
43 |
44 | # Load the form_d_submission TSVs from every folder in the Form D directory
45 | form_d_submission = db.read_csv(f"{path}/*/FORMDSUBMISSION.tsv", dtype={'FILING_DATE':'VARCHAR'}).df()
46 | form_d_submission
47 |
48 |
49 | # In[6]:
50 |
51 |
52 | # This query joins the three data frames together
53 |
54 | joins = """
55 | select * from offerings as o
56 | left join issuers as i on i.ACCESSIONNUMBER = o.ACCESSIONNUMBER
57 | left join form_d_submission as fds on fds.ACCESSIONNUMBER = o.ACCESSIONNUMBER
58 | """
59 |
60 | where = """
61 | where INVESTMENTFUNDTYPE = 'Venture Capital Fund'
62 | """
63 |
64 | query = joins + where
65 |
66 | df = db.query(query).df()
67 | df
68 |
69 |
70 | # In[7]:
71 |
72 |
73 | def normalize_date(date_str):
74 |     if pd.isna(date_str):
75 |         return np.nan
76 |     try:
77 |         # Try to parse the date string
78 |         parsed_date = parser.parse(date_str)
79 |         # Return the date in a standard format
80 |         return parsed_date.strftime('%Y-%m-%d')
81 |     except (ValueError, OverflowError, TypeError):
82 |         # If parsing fails, return NaN
83 |         return np.nan  # or return date_str if you prefer
84 |
85 | # Apply the function to your
DataFrame column 86 | df['normalized_date'] = df['FILING_DATE'].apply(normalize_date) 87 | 88 | 89 | # In[8]: 90 | 91 | 92 | #Filter out prior filings for the same fund 93 | previous_accession_numbers = set(df['PREVIOUSACCESSIONNUMBER'].dropna()) 94 | vc = df[~df['ACCESSIONNUMBER'].isin(previous_accession_numbers)] 95 | 96 | 97 | # In[9]: 98 | 99 | 100 | # Create a new DuckDB database file 101 | vc_database = db.connect('vc_database.duckdb') 102 | 103 | # Execute the join query and write the results directly to the new database 104 | vc_database.execute(f""" 105 | CREATE TABLE vc_data AS 106 | select * from vc 107 | """) 108 | 109 | vc_database.execute(f'select * from vc').df() 110 | 111 | 112 | # In[10]: 113 | 114 | 115 | # Close the connection 116 | vc_database.close() 117 | 118 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Ethan Finkel 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SEC_Form_D_Evidence 2 | Open source repo from my blog post: https://www.ethanfinkel.com/posts/evidence-project. 3 | 4 | A demo of this project is hosted on my website at: https://data.ethanfinkel.com/. 5 | 6 | This repo contains both the data prep and display layer for analyzing VC Form D SEC filings. 7 | 8 | ## File Prep 9 | To get Form D data, download TSVs from the link below and drop them in the empty Form_D_Data folder in this repo. 10 | 11 | Link: https://www.sec.gov/data-research/sec-markets-data/form-d-data-sets. 12 | 13 | ## Data Prep 14 | The Jupyter notebook and Python file contain code to combine the TSV files, then clean and join the data. The end product of running all of the code is a DuckDB-formatted file called vc_database.duckdb. I've already created this file and added it as a source for Evidence.dev, but you can re-create that file using the notebook or Python file. 15 | 16 | The data is prepped with DuckDB, with some light data cleaning using Pandas, NumPy, and dateutil. 17 | 18 | ## Evidence 19 | Run the Evidence project by installing Evidence and starting your dev server. For instructions on installing Evidence and running code locally, check out their documentation here: https://docs.evidence.dev/install-evidence/. Since this project is already created and set up, you can just use the Start Evidence command rather than creating a new project. 20 | 21 | All of the code for the Evidence project is in the Evidence/pages/index.md file. 
Feel free to change this as you see fit and preview your changes on the local dev server at localhost:3000. 22 | 23 | ## Closing 24 | I hope people enjoy this open source example of prepping and cleaning data using DuckDB and visualizing it in Evidence.dev. Feel free to leave comments or reach out to me on Twitter here: https://twitter.com/ethanf_17 25 | --------------------------------------------------------------------------------