Affordable PDF & Web → Markdown

Convert PDFs, websites, and images to clean Markdown — at 90% lower cost.

Built for developers and startups feeding large models. Async by design — not the fastest, significantly cheaper. $0.005 per PDF page. First 1,000 pages free.

No file retention · 60,000+ developers served · 5-minute integration
cobbling.convert(report-q3.pdf)
Input: PDF · Output: Markdown
4 pages · 12.4 KB · OCR parsed in 38s
The Cobbling facts

Honest numbers, not hyperbole.

We're slower than premium APIs and significantly cheaper. The trade-off is the entire pitch.

90% lower cost than premium parsers (vs. average of 5 leading APIs)
$0.005/pg for PDF to Markdown (first 1,000 pages free)
60K+ developers served since 2023 (across 40+ countries)
5 min from signup to first conversion (REST · webhooks · SDKs)
How it works

Submit, poll, retrieve. That's it.

An async pipeline tuned for batch jobs and large-model ingestion. No persistent connections, no streaming, no surprise bills.

01

Submit a file or URL

POST a PDF, image, or web URL. We accept files up to 50 MB and pages up to ~1,000.

POST /v1/jobs
02

We parse it asynchronously

OCR, layout reconstruction, table detection. Cost-optimized models, not the fastest — by design.

status: processing
03

Webhook or poll

Get notified the second a job finishes, or query the status endpoint on your own schedule.

POST webhook · or GET /v1/jobs/:id
04

Retrieve clean Markdown

Headings, lists, tables, code blocks — preserved. Stored 5 days, then deleted forever.

GET /v1/jobs/:id/result
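
The submit, poll, retrieve flow above can be sketched as a small polling helper. This is a minimal sketch, not Cobbling's SDK: `wait_for_job`, the injected `fetch_status` callable, the 5-second interval, and the `"failed"` status name are assumptions for illustration. Only `queued`, `processing`, and `completed` appear on this page. The one-hour default timeout mirrors the stated one-hour delivery commitment.

```python
import time
from typing import Callable, Dict

# "completed" appears in the sample response; "failed" is an assumed status name.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], Dict],
    job_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status(job_id) until the job reaches a terminal status.

    fetch_status is a hypothetical callable that returns the job as a dict,
    for example by wrapping a GET request to the job status endpoint.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for the real API: three canned status responses.
    states = iter(["queued", "processing", "completed"])
    fake_fetch = lambda job_id: {"id": job_id, "status": next(states)}
    print(wait_for_job(fake_fetch, "job_8f2a91c4", sleep=lambda s: None))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. If you use webhooks, the same terminal-status check applies to the notification payload instead.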
For developers

A single endpoint.
A predictable response.

No SDK gymnastics. Cobbling exposes one REST surface that maps cleanly onto any HTTP client. Use webhooks if you have them; poll if you don't.

  • HTTP & webhooks. No long-lived connections, no streaming protocol to learn.
  • Idempotent jobs. Retry safely on flaky networks; jobs are uniquely keyed.
  • Structured output. Tables, headings, code blocks survive the conversion intact.
  • Strict retention policy. Source files are never persisted; results expire after 5 days.
# 1. Submit a job (the response includes the job id)
curl -X POST https://api.cobbling.ai/v1/jobs \
  -H "Authorization: Bearer $COBBLING_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"pdf","url":"https://x.com/q3.pdf"}'

# 2. Poll until ready ($ID is the id returned in step 1)
curl "https://api.cobbling.ai/v1/jobs/$ID" \
  -H "Authorization: Bearer $COBBLING_KEY"

# 3. Retrieve Markdown
curl "https://api.cobbling.ai/v1/jobs/$ID/result" \
  -H "Authorization: Bearer $COBBLING_KEY"
// JavaScript
import { Cobbling } from "cobbling"

const co = new Cobbling({ key: process.env.COBBLING_KEY })

const job = await co.jobs.create({
  type: "pdf",
  url: "https://x.com/q3.pdf",
  webhook: "https://my.app/cobbling"
})

// → { id, status: "queued" }
const { markdown } = await co.jobs.wait(job.id)
# Python
import os

from cobbling import Cobbling

co = Cobbling(key=os.environ["COBBLING_KEY"])

job = co.jobs.create(
    type="pdf",
    url="https://x.com/q3.pdf",
    webhook="https://my.app/cobbling",
)

# Block until complete (max 1h)
result = co.jobs.wait(job.id)
print(result.markdown)
{
  "id": "job_8f2a91c4",
  "status": "completed",
  "input": {
    "type": "pdf",
    "pages": 4,
    "size_bytes": 248192
  },
  "result": {
    "markdown_url": "https://r.cobbling.ai/…",
    "expires_at": "2026-05-07T14:02Z"
  },
  "cost_usd": 0.020
}
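
As an illustration of consuming the response above, the sketch below parses the sample JSON, reads the expiry timestamp, and checks that `cost_usd` equals pages × $0.005. The helper name `summarize` is hypothetical, the field names are taken only from the sample shown here, and the cost check assumes the job was billed at the plain per-page rate.

```python
import json
from datetime import datetime, timezone

PDF_PRICE_PER_PAGE = 0.005  # per-page PDF price quoted on this page

# The sample response from this page, including its elided result URL.
SAMPLE_RESPONSE = """
{
  "id": "job_8f2a91c4",
  "status": "completed",
  "input": {"type": "pdf", "pages": 4, "size_bytes": 248192},
  "result": {
    "markdown_url": "https://r.cobbling.ai/…",
    "expires_at": "2026-05-07T14:02Z"
  },
  "cost_usd": 0.020
}
"""


def summarize(raw: str) -> dict:
    """Extract the fields a caller usually needs from a job response."""
    job = json.loads(raw)
    # The sample timestamp has minute precision and a literal "Z" (UTC).
    expires_at = datetime.strptime(
        job["result"]["expires_at"], "%Y-%m-%dT%H:%MZ"
    ).replace(tzinfo=timezone.utc)
    expected_cost = job["input"]["pages"] * PDF_PRICE_PER_PAGE
    return {
        "done": job["status"] == "completed",
        "expires_at": expires_at,
        "cost_matches": abs(job["cost_usd"] - expected_cost) < 1e-9,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_RESPONSE))
```

Because results expire after 5 days, a caller would typically download the Markdown before `expires_at` rather than store the URL.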
Key advantages

Built for the jobs that don't need to be instant.

Cobbling makes deliberate trade-offs: lower cost, stronger privacy, async-first. Three reasons that combination wins for AI ingestion work.

Cost-Efficient

$0.005 per PDF page, $0.0001 per web page. A run that costs $200 elsewhere costs $20 here.

PDF · $0.005 /pg WEB · $0.0001 /url
[Chart: per-page price bars at $0.05, $0.04, $0.03, and $0.005]

Privacy & Safety

Files are read once, parsed, and discarded. Results expire after 5 days — never resold, never trained on.

RETENTION 0s · RESULTS 5d

Async by Design

Most jobs finish in 30s–10m. Long ones in under an hour. Built for batches, RAG indexing, and overnight ETL — not for chatbots.

P50 ~45s · P99 <1h
[00:00] job.created
[00:02] queue.accepted
[00:14] parse.started
[00:38] job.completed ✓
[00:38] webhook.delivered
Use cases

Where teams are using Cobbling today.

— 01

RAG knowledge bases

Ingest PDFs, manuals, and policy docs into vector stores without paying premium per-page rates on tens of thousands of documents.

Vector · Indexing
— 02

Agent training data

Convert web articles and reference PDFs into clean, structured Markdown, a format well suited to LLM training corpora.

Fine-tuning
— 03

Web monitoring & scrape

Crawl public pages, screenshot the visible viewport, store Markdown for diffs. $0.0001 per page lets you watch entire industries.

Scrape · Diff
— 04

Document QA pipelines

Insurance forms, contracts, lab reports — preserve table structure for downstream extraction without bespoke parsers per format.

Extraction
Compare

A different shape than the premium tier.

If your job needs sub-second results, use a premium parser. If it doesn't, Cobbling will save you 90% on the same volume of work.

| | Premium parsers | Cobbling | Self-hosted OCR |
|---|---|---|---|
| PDF page price | $0.04 – $0.06 | $0.005 | infra cost |
| Web page price | $0.001 – $0.005 | $0.0001 | infra cost |
| Latency target | < 5s, real-time | 30s – 1h, async | depends |
| File retention | Varies (read T&Cs) | None | Your servers |
| Setup time | ~30 min, billing review | ~5 min | Days (model + infra) |
| Free tier | Trial credits | 1,000 pages free | |

Cobbling began as a startup team building products around large AI models.

Along the way we kept hitting the same problem, as did many other businesses: converting documents and web content was expensive. The available tools prioritized speed over price, which put them out of reach for startups and individual developers.

Not every workload needs fast conversion, so we built a set of tools that trade speed for cost. They are not the fastest, but they are significantly cheaper.

These tools have supported more than 60,000 users with a stable, secure, budget-friendly service. Encouraged by the feedback, we spun them off as standalone products.

Developers served: 60,000+
Founded: 2023
Headquartered: Dover, DE
Pricing

Pay only for what you use.

One plan. A flat $5/month subscription that includes the first 5,000 PDF pages and 1,000 web pages. Beyond that, you pay per unit — no surprises, no enterprise tier wall.

Pro · the only plan

$5 /month

Includes 5,000 PDF pages & 1,000 web pages every month.

  • PDF & Image to Markdown: $0.005 per page after the included 5,000.
  • Unlimited Webscoop: $0.0001 per page after the included 1,000.
  • Webhooks & status API: No extra charge. Run a job, get notified, retrieve.
  • Email support · 24h SLA: Real humans, real replies, every business day.
Estimator (example)
PDF pages: 8,000. 3,000 over the included 5,000 × $0.005 = $15.00
Web pages: 5,000. 4,000 over the included 1,000 × $0.0001 = $0.40
Total: $5.00 subscription + $15.40 usage = $20.40/mo vs premium ~$325.00 · save 94%
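
The estimator arithmetic can be reproduced in a few lines. This is a sketch under the pricing stated on this page ($5/month, 5,000 PDF pages and 1,000 web pages included, then $0.005 and $0.0001 per page); the function name `monthly_cost` is made up for illustration. The ~$325 premium figure is taken from the estimator as-is. It happens to equal 8,000 × $0.04 + 5,000 × $0.001, the low end of the premium ranges in the comparison table.

```python
from decimal import Decimal

SUBSCRIPTION = Decimal("5.00")   # flat monthly fee
INCLUDED_PDF_PAGES = 5000        # PDF pages covered by the subscription
INCLUDED_WEB_PAGES = 1000        # web pages covered by the subscription
PDF_RATE = Decimal("0.005")      # per PDF page beyond the included pages
WEB_RATE = Decimal("0.0001")     # per web page beyond the included pages


def monthly_cost(pdf_pages: int, web_pages: int) -> Decimal:
    """Subscription plus per-page overage for one month."""
    pdf_over = max(0, pdf_pages - INCLUDED_PDF_PAGES)
    web_over = max(0, web_pages - INCLUDED_WEB_PAGES)
    return SUBSCRIPTION + pdf_over * PDF_RATE + web_over * WEB_RATE


if __name__ == "__main__":
    cost = monthly_cost(8000, 5000)
    premium = Decimal("325.00")  # premium estimate quoted by the estimator
    savings_pct = (1 - cost / premium) * 100
    print(f"${cost:.2f}/mo, about {savings_pct:.0f}% below ${premium}")
```

`Decimal` is used instead of floats so that sub-cent rates such as $0.0001 add up exactly.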
FAQ

Questions, answered honestly.

If you don't see yours here, email our support team. We answer within 24 hours.

01 How is billing calculated?
A flat $5/month subscription that includes 5,000 PDF pages and 1,000 web pages. Beyond that, PDF conversion is billed at $0.005 per page and web scraping at $0.0001 per page. No tiers, no minimums, no overage surprises.
02 What types of PDF files are supported?
Files up to 50 MB. We recommend a maximum of around 1,000 pages — there's no hard limit but more pages mean longer processing. Encrypted PDFs are not supported.
03 How long does PDF parsing take? Is real-time supported?
Asynchronous only — that's the design. Processing depends on file size and page count, ranging from 30 seconds to 1 hour, with most jobs finishing in under 5 minutes. We commit to delivering results within 1 hour.
04 How is data security and privacy ensured?
We read documents only during parsing — we never persist the originals. Converted results are stored for up to 5 days, after which they can no longer be accessed via the API. We never train on your data.
05 How are job errors handled?
Two paths: poll the task status API, or register a webhook to listen for job-state notifications. Failed jobs return a structured error code so you can retry programmatically.
06 What websites can be scraped?
Publicly reachable HTTP/HTTPS pages — articles, documentation, blogs, marketing pages, and most static or server-rendered sites. Pages requiring authentication, JavaScript-only paywalls, or non-standard protocols are not supported.
07 What about webpage screenshots?
Screenshots are saved for 5 days at a public URL. Currently, only the first (visible) viewport is captured. If you need to keep them longer, download and store them yourself.
08 What support is included?
Email support with a 24-hour response commitment. Write to us with anything: feature requests, bug reports, integration questions.
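
For the job-error question (05), programmatic retry might look like the sketch below. Everything specific here is an assumption: the `"failed"` status, the `error` field, the `is_retryable` predicate, and the backoff values are hypothetical, since this page does not document the error schema. `submit` and `wait` are placeholders for whatever calls you use to create a job and wait for its final state.

```python
import time
from typing import Callable, Dict


def run_with_retries(
    submit: Callable[[], Dict],
    wait: Callable[[str], Dict],
    max_attempts: int = 3,
    backoff_s: float = 2.0,
    is_retryable: Callable[[Dict], bool] = lambda job: True,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Resubmit a job until it completes or the attempts run out.

    submit() creates a job and returns a dict with an "id" key.
    wait(job_id) returns the job's final state as a dict.
    """
    last: Dict = {}
    for attempt in range(1, max_attempts + 1):
        job = submit()
        last = wait(job["id"])
        if last.get("status") == "completed":
            return last
        if not is_retryable(last):
            break  # e.g. an error code that a retry cannot fix
        if attempt < max_attempts:
            # Exponential backoff: backoff_s, then 2x, 4x, ...
            sleep(backoff_s * 2 ** (attempt - 1))
    # "error" is an assumed field name for the structured error.
    raise RuntimeError(f"job did not complete: {last.get('error')}")
```

In practice `is_retryable` would inspect the structured error code and skip retries for permanent failures such as an encrypted PDF, which the FAQ lists as unsupported.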

Start with 1,000 free pages.

No credit card required. Sign up, grab your API key, and convert your first PDF in under five minutes.