
Unstructured

Unstructured is an open-source ETL solution that converts complex documents into structured data for language models, featuring enterprise-grade capabilities like workflow orchestration, document partitioning, enrichment, chunking, and embedding.

Language: HTML
Latest Release: 0.18.22
License: Apache License 2.0



Key Features

  • Transforms unstructured documents into structured data
  • Enterprise-grade workflow automation
  • Supports partitioning, enrichment, chunking, and embedding
  • Optimized for extracting data for language models
  • Open-source and production-ready
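
To make the partitioning and chunking steps named above more concrete, here is a minimal standard-library sketch of the idea. It is not Unstructured's API: the Element class, the partition and chunk functions, the title heuristic, and the max_chars parameter are all hypothetical names and choices made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Element:
    """One piece of a partitioned document (hypothetical type, for illustration)."""
    category: str  # "Title" or "NarrativeText"
    text: str


def partition(raw: str) -> list[Element]:
    """Split raw text into elements on blank lines.

    Assumed heuristic: a short block with no terminal punctuation is a title;
    everything else is narrative text.
    """
    elements = []
    for block in raw.split("\n\n"):
        text = " ".join(block.split())  # collapse internal whitespace
        if not text:
            continue
        is_title = len(text) < 60 and not text.endswith((".", "!", "?", ":"))
        elements.append(Element("Title" if is_title else "NarrativeText", text))
    return elements


def chunk(elements: list[Element], max_chars: int = 200) -> list[str]:
    """Group elements into chunks, starting a new chunk at each title
    or when adding an element would exceed max_chars."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for el in elements:
        starts_section = el.category == "Title"
        too_big = size + len(el.text) > max_chars
        if current and (starts_section or too_big):
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(el.text)
        size += len(el.text)
    if current:
        chunks.append("\n".join(current))
    return chunks


SAMPLE = """Overview

Unstructured converts documents into structured data.

Features

It supports partitioning, enrichment, chunking, and embedding."""

if __name__ == "__main__":
    for piece in chunk(partition(SAMPLE)):
        print(repr(piece))
```

Chunks produced this way keep each heading together with the text beneath it, which is the kind of unit typically passed on to an embedding step. The real library handles many more document types and element categories than this sketch suggests.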

Alternative Tools

  • LangChain
  • Apache Tika
  • doccano
  • Haystack


Community

Stars: 13.4k
Open Issues: 233
Forks: 1.1k