
1 stock code →
full company story
Korean DART + US SEC EDGAR filings, pre-structured.
Datasets hosted on Hugging Face. One line of Python.
Already Structured, Ready to Use
See how listed companies
are connected — in 10 seconds
Every listed company · every industry · every supply-chain relationship on one screen. Click a bubble to drill into an industry, select a company for 5-year financials, supply-chain HHI, AI analysis, and deep-dive reports — all in a single card. Automatic detection of sudden changes this fiscal year.
dartlab visualizes disclosure and financial data. Not investment advice. Always cross-check with original DART filings and analyst reports.
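For readers unfamiliar with the supply-chain HHI shown in the card: the Herfindahl-Hirschman Index is conventionally the sum of squared percentage shares, so it ranges from near 0 (many small counterparties) to 10,000 (a single counterparty). The sketch below shows that standard formula only. How dartlab derives the shares is not stated on this page, so the function name and the input weights are hypothetical.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index on the 0..10,000 scale from raw weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Convert each weight to a percentage share, then sum the squares.
    return sum((100.0 * w / total) ** 2 for w in weights)


# Hypothetical customer weights for two companies (illustrative only).
concentrated = hhi([80, 10, 10])      # one dominant customer
diversified = hhi([25, 25, 25, 25])   # four equal customers
print(concentrated, diversified)
```

A higher value means the relationship set is dominated by fewer counterparties.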
The Truth About Every Company Is Already Public
It's just unreadable. Annual reports are 200+ pages, quarterly filings pile up, and the data you need is buried across documents, formats, and years.

# 1. Download PDF from DART
pdf = download_report("005930", "2024")
# 2. Extract tables from PDF
tables = parse_pdf_tables(pdf)
# 3. Manual account mapping
mapped = manual_map(tables, my_schema)
# 4. Repeat for each quarter...
# 5. Repeat for each company...
# 6. Hope the formats match

import dartlab
c = dartlab.Company("005930")
c.BS # standardized balance sheet
c.ratios # 47 financial ratios
c.diff() # 5 years of changes

34,249 Account Mappings. Zero Manual Work.
Every company files with different XBRL account IDs. DartLab normalizes them through a 4-step pipeline so cross-company comparison works automatically.
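The idea of account normalization can be sketched as a lookup from filer-specific account IDs to one canonical name. This is a minimal illustration, not DartLab's 4-step pipeline, whose steps are not described here. The account IDs, the `ACCOUNT_MAP` table, and the `normalize` helper are all hypothetical.

```python
# Hypothetical mapping from filer-specific account IDs to canonical names.
ACCOUNT_MAP = {
    "ifrs-full_Revenue": "revenue",
    "entity_SalesRevenue": "revenue",      # a company-specific extension ID
    "ifrs-full_ProfitLoss": "net_income",
}


def normalize(raw_accounts):
    """Return ({canonical_name: value}, [unmapped account IDs])."""
    normalized, unmapped = {}, []
    for account_id, value in raw_accounts.items():
        canonical = ACCOUNT_MAP.get(account_id)
        if canonical is None:
            unmapped.append(account_id)    # keep for later review, never guess
        else:
            normalized[canonical] = value
    return normalized, unmapped


# Two filers reporting revenue under different IDs (values are placeholders).
company_a = {"ifrs-full_Revenue": 1000, "ifrs-full_ProfitLoss": 120}
company_b = {"entity_SalesRevenue": 400, "custom_Other": 5}

a, a_unmapped = normalize(company_a)
b, b_unmapped = normalize(company_b)
print(a["revenue"], b["revenue"], b_unmapped)
```

Once both filers expose the same `revenue` key, comparing them needs no per-company handling.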
revenue: cross-company comparison just works

From Vertical Filings to One Horizontal Map
The real product is not a parser list. It is the map.
One Company, Four Namespaces
The map stays the same. Only the source responsibility changes.
Structural Spine
Owns section boundaries, narrative payloads, retrieval blocks, and the raw evidence layer that keeps the company map grounded in the filings.
Authoritative Numbers
Owns balance sheet, income statement, cash flow, ratios, and comparable time series. When numbers are available here, they should win.
Structured Disclosure
Owns structured disclosure APIs such as audit, dividend, employees, executives, and similar periodic report payloads where docs should not be the first authority.
Merged Company Layer
What the user sees by default: one company surface built on the same sections spine, ready for Python workflows now and AI interfaces next.
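The source-responsibility rule above can be sketched as a fixed lookup order. The namespace names `finance`, `report`, and `docs` follow the stability list on this page. The single priority tuple, the `resolve` helper, and the sample payloads are assumptions made for illustration, not the library's merge logic.

```python
# Assumed order: finance wins for numbers, report for structured disclosure,
# and docs is the fallback that keeps everything grounded in the filings.
PRIORITY = ("finance", "report", "docs")


def resolve(topic, sources):
    """Return (source_name, value) from the highest-priority source holding `topic`."""
    for name in PRIORITY:
        payload = sources.get(name, {})
        if topic in payload:
            return name, payload[topic]
    return None, None


# Placeholder payloads; none of these values come from a real filing.
sources = {
    "docs": {"revenue": "narrative mention", "dividend": "see notes", "riskManagement": "FX risk..."},
    "report": {"dividend": {"perShare": 100}},
    "finance": {"revenue": 12345},
}

print(resolve("revenue", sources))         # number comes from finance
print(resolve("dividend", sources))        # structured payload comes from report
print(resolve("riskManagement", sources))  # narrative comes from docs
```

Because finance and report cover different topics, one ordered tuple is enough to express both rules in this sketch.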
From Stock Code to Company Map
No parser inventory first. Start from the company board.
Install
One line — uv add dartlab. No separate data preparation needed.
Create Company
Start from the public entrypoint. Missing data is auto-downloaded from Hugging Face.
sections = the company
One DataFrame with every topic and every period. show, diff, trace are just views on top.
sections is the whole company
One DataFrame. Every topic. Every period. Here's what you actually get.
from dartlab import Company
samsung = Company("005930")
board = samsung.sections
board # canonical company map
board.shape # (329, 106)
| chapter | topic | blockType | 2024 | 2023 |
|---|---|---|---|---|
| I | companyOverview | text | Founded in 1969… | Founded in 1969… |
| II | businessOverview | text | Semiconductors… | Semiconductors… |
| II | businessOverview | table | Revenue (5×3) | Revenue (5×3) |
| III | riskManagement | text | FX risk… | FX risk… |
| V | auditOpinion | text | Unqualified | Unqualified |
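To make the shape of that table concrete, here is a plain-Python sketch of turning per-filing blocks into one row per topic block with one column per period. It illustrates the layout only. The `horizontalize` helper and the record fields are hypothetical, and the real `sections` object is a DataFrame built by the library.

```python
from collections import defaultdict


def horizontalize(blocks):
    """Pivot per-filing block records into one row per (chapter, topic, blockType)."""
    periods = sorted({b["period"] for b in blocks}, reverse=True)  # newest first
    by_key = defaultdict(dict)
    for b in blocks:
        key = (b["chapter"], b["topic"], b["blockType"])
        by_key[key][b["period"]] = b["payload"]

    table = []
    for (chapter, topic, block_type), payloads in by_key.items():
        row = {"chapter": chapter, "topic": topic, "blockType": block_type}
        for period in periods:
            row[period] = payloads.get(period)  # None when that filing lacks the block
        table.append(row)
    return periods, table


# Vertical input: one record per block per filing (payloads abbreviated).
blocks = [
    {"period": "2024", "chapter": "I", "topic": "companyOverview",
     "blockType": "text", "payload": "Founded in 1969..."},
    {"period": "2023", "chapter": "I", "topic": "companyOverview",
     "blockType": "text", "payload": "Founded in 1969..."},
    {"period": "2024", "chapter": "III", "topic": "riskManagement",
     "blockType": "text", "payload": "FX risk..."},
]

periods, table = horizontalize(blocks)
for row in table:
    print(row)
```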
No Code Required. Just Ask.
DartLab structures the data and feeds it to the LLM. You ask questions in plain language — from your terminal or Python.
The 2-tier architecture feeds structured company data to any LLM. Basic analysis works with every provider. Tool-calling providers go deeper.
Samsung Electronics — Actual Output
What you get from Company("005930") out of the box
| chapter | topic | blockType | 2024 | 2023 | 2022 |
|---|---|---|---|---|---|
| I | companyOverview | text | Founded in 1969… | Founded in 1969… | Founded in 1969… |
| II | businessOverview | text | Semiconductors, display… | Semiconductors, display… | Semiconductors, display… |
| II | businessOverview | table | Revenue mix (5×3) | Revenue mix (5×3) | Revenue mix (5×3) |
| III | riskManagement | text | FX risk exposure… | FX risk exposure… | — |
| V | auditOpinion | text | Unqualified | Unqualified | Unqualified |
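With periods laid out as columns, spotting a change is a comparison of adjacent cells in one row. The sketch below flags where a block was added, removed, or modified between consecutive periods, using the riskManagement row above as input. It is not the library's `diff` implementation, and `changed_periods` is a hypothetical helper.

```python
def changed_periods(row, periods):
    """Return (previous, current, kind) for adjacent periods whose payload differs."""
    ordered = sorted(periods)  # oldest first
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        before, after = row.get(prev), row.get(cur)
        if before == after:
            continue
        if before is None:
            kind = "added"
        elif after is None:
            kind = "removed"
        else:
            kind = "modified"
        changes.append((prev, cur, kind))
    return changes


# The riskManagement row from the table; None stands for the empty 2022 cell.
row = {
    "topic": "riskManagement",
    "2024": "FX risk exposure...",
    "2023": "FX risk exposure...",
    "2022": None,
}

changes = changed_periods(row, ["2024", "2023", "2022"])
print(changes)
```

An exact equality check is the simplest possible test; a fuller comparison would score how much the text changed rather than only whether it did.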
42 Modules, One Structure
All modules sit on the same sections spine. No separate schemas.
Narrative structure, section boundaries, retrieval blocks
DART + EDGAR — Same Interface
Korean DART and US SEC EDGAR through one Company interface
from dartlab import Company
c = Company("005930")
c.sections
c.show("businessOverview")
c.BS
c.ratios
c.diff("businessOverview")
c.insights.grades()

c = Company("AAPL")
c.sections
c.show("business")
c.BS
c.ratios
c.diff("10-K::item7Mdna")
c.insights.grades()

| Feature | DART | EDGAR |
|---|---|---|
| sections horizontalization | ✓ | ✓ |
| show(topic) | ✓ | ✓ |
| trace(topic) | ✓ | ✓ |
| diff(topic) | ✓ | ✓ |
| BS · IS · CF normalization | ✓ | ✓ |
| ratios time series | ✓ | ✓ |
| timeseries | ✓ | ✓ |
| report API (28 types) | ✓ | — |
| insights (7-area grading) | ✓ | — |
| sector classification | ✓ | — |
| market ranking | ✓ | — |
| AI company analysis | ✓ | ✓ |
| Excel export | ✓ | ✓ |
| Desktop GUI | ✓ | ✓ |
Company("005930") for DART · Company("AAPL") for EDGAR — same interface, same methods
Fast Because It's Simple
Polars + Parquet + one structure = no unnecessary conversion
One structure. All queries run on sections — no data conversion needed.
Polars. DataFrame operations that often run 5-10x faster than in pandas.
Parquet. Columnar format reads only the columns you need.
Transparent Stability Tiers
Clear about what's stable and what's experimental
- Company facade
- DART sections / show / trace / diff
- DART docs / finance / report
- search / listing
- BS · IS · CF · ratios · timeseries
- EDGAR Company (sections, finance)
- insights (7-area grading)
- rank / sector
- Excel export
- Server API (FastAPI 40+ endpoints)
- MCP server (60 tools)
- AI analysis (7 providers)
- AI GUI (Desktop)
- Network scanner (new)
Roadmap
- sections text structure
- EDGAR sections 100%
- Network scanner
- profile.sections merged view
- TopicView implementation
- show() completion
- EDINET engine
- AI GUI improvements
- Rust pipeline (sections)
Questions DartLab Answers
Every question starts from the same company map. No glue code, no context switching.
"What's their real credit risk?"
Independent credit evaluation (dCR) rebuilt from disclosure data alone — repayment capacity, capital structure, liquidity, cash-flow quality, disclosure risk. No agency rating, no black box. One call: c.credit().
"Show me the numbers in context"
Financial statements alone miss half the story. DartLab puts BS/IS/CF next to the narrative that explains them — same company, same timeline, same object.
"Screen the entire market"
Scan all 2,700 listed companies by governance quality, workforce trends, capital returns, or debt risk. One call: dartlab.governance("all"). Filter, rank, compare.
"Build a research dataset"
Standardized text + financial data across hundreds of companies. Ready for NLP, ML training, or academic research. No cleaning, no alignment — already done.
"Let AI analyze with real evidence"
Feed structured company context to any LLM — not raw PDFs. 7 providers supported. The AI reasons over actual disclosure data, not hallucinated summaries.
"One tool for Korea and US"
Same Company interface for Korean DART and US EDGAR. Learn it once, apply it to both markets. Compare Samsung and Apple with the same API.
Installation
Start analyzing right after install
No separate data preparation needed. Pass a stock code and missing data is automatically downloaded from Hugging Face.
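Three Lines and You're Done
With a single stock code, the company name, disclosure status, and financial statements for every quarter come along automatically.
import dartlab
c = dartlab.Company("005930") # Samsung Electronics
c.show("IS") # income statement, all quarters
c.show("CF") # cash flow statement; just change the string

Colab · Molab · local marimo: the same code runs right away on all three.
Hands-On Notebooks
11 topics, available on Colab · Molab · local marimo, all running the same code.
Marimo notebooks are fastest to edit locally. Run the command below and the editor opens in your browser automatically. Change the file name to open any other notebook the same way.
The marimo version is code only; the Colab version pairs markdown explanations with code. The same content is maintained in both formats.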

Start Reading Companies, Not PDFs
One stock code. Every filing structured. Every period comparable.
One line of Python gives you what used to take days of PDF reading.