MkDowner supports a wide range of file formats for conversion to Markdown, powered by Microsoft MarkItDown.

Documents

PDF Files

.pdf - Portable Document Format files with text extraction

Word Documents

.docx - Microsoft Word documents (Office Open XML)

Text Files

.txt - Plain text files

Rich Text

.rtf - Rich Text Format documents

Presentations & Spreadsheets

PowerPoint

.pptx - PowerPoint presentations

Excel

.xlsx - Excel spreadsheet files

CSV

.csv - Comma-separated values

Web & Data Formats

HTML

.html - HTML web pages

JSON

.json - JSON data files

XML

.xml - XML documents

Images & Media

Images

.png, .jpg - Image files with OCR text extraction

Audio

.wav, .mp3 - Audio files for transcription

File Size Limitations

Each file must be 24MB or smaller so that processing stays responsive. The upload area in the UI states this limit:
src/App.tsx
{!showSuccess && (
  <div className="hero-upload-footer">
    Supports PDF files up to 24MB
  </div>
)}
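
The limit can also be checked before upload. The following is a minimal sketch, not MkDowner's actual code: `MAX_FILE_BYTES`, `SizedFile`, and `partitionBySize` are hypothetical names, and it assumes "24MB" means 24 × 1024 × 1024 bytes, which the backend may count differently.

```typescript
// Assumption: "24MB" is interpreted as 24 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 24 * 1024 * 1024;

// Minimal shape shared with browser File objects; only name and size are needed.
interface SizedFile {
  name: string;
  size: number; // size in bytes
}

// Splits a selection into files within the limit and files that exceed it.
function partitionBySize<T extends SizedFile>(
  files: T[],
  maxBytes: number = MAX_FILE_BYTES
): { accepted: T[]; rejected: T[] } {
  const accepted = files.filter((file) => file.size <= maxBytes);
  const rejected = files.filter((file) => file.size > maxBytes);
  return { accepted, rejected };
}
```

Because a browser `File` has `name` and `size`, the array from a file input can be passed in directly, and the `rejected` list can drive an error message.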

Format Categories

Formats are organized into four main categories in the UI:
src/components/SupportedFormats/SupportedFormats.tsx
<div className="supported-grid">
  <div className="format-category">
    <h4>Documents</h4>
    <ul className="format-list">
      <li>PDF files (.pdf)</li>
      <li>Word documents (.docx)</li>
      <li>Text files (.txt)</li>
      <li>Rich Text (.rtf)</li>
    </ul>
  </div>

  <div className="format-category">
    <h4>Presentations &amp; Spreadsheets</h4>
    <ul className="format-list">
      <li>PowerPoint (.pptx)</li>
      <li>Excel files (.xlsx)</li>
      <li>CSV files (.csv)</li>
    </ul>
  </div>

  <div className="format-category">
    <h4>Web &amp; Data</h4>
    <ul className="format-list">
      <li>HTML files (.html)</li>
      <li>JSON files (.json)</li>
      <li>XML files (.xml)</li>
    </ul>
  </div>

  <div className="format-category">
    <h4>Images &amp; Media</h4>
    <ul className="format-list">
      <li>PNG, JPG images</li>
      <li>Audio files (.wav, .mp3)</li>
      <li>OCR text extraction</li>
    </ul>
  </div>
</div>
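
The same four categories can be expressed as data, which makes it possible to look up a file's category by extension. This is an illustrative sketch only: `FORMAT_CATEGORIES` and `categoryOf` are hypothetical names, and the `.png` and `.jpg` extensions are inferred from the "PNG, JPG images" entry.

```typescript
type FormatCategory =
  | "Documents"
  | "Presentations & Spreadsheets"
  | "Web & Data"
  | "Images & Media";

// Extensions per category, copied from the lists in the UI component.
const FORMAT_CATEGORIES: Record<FormatCategory, string[]> = {
  "Documents": [".pdf", ".docx", ".txt", ".rtf"],
  "Presentations & Spreadsheets": [".pptx", ".xlsx", ".csv"],
  "Web & Data": [".html", ".json", ".xml"],
  // Assumption: ".png" and ".jpg" stand in for "PNG, JPG images";
  // other spellings such as ".jpeg" are not listed in the docs.
  "Images & Media": [".png", ".jpg", ".wav", ".mp3"],
};

// Returns the category for a file name, or undefined if the extension is not listed.
function categoryOf(fileName: string): FormatCategory | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return undefined; // no extension at all
  const ext = fileName.slice(dot).toLowerCase();
  const categories = Object.keys(FORMAT_CATEGORIES) as FormatCategory[];
  for (let i = 0; i < categories.length; i++) {
    if (FORMAT_CATEGORIES[categories[i]].indexOf(ext) !== -1) {
      return categories[i];
    }
  }
  return undefined;
}
```

A lookup that returns `undefined` for unlisted extensions lets the UI reject unsupported files before they reach the backend.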

Backend Requirements

MkDowner relies on Microsoft MarkItDown for conversion. The backend must have MarkItDown installed and properly configured; the conversion engine detects the input format automatically.

API Endpoint

The backend exposes a single upload endpoint:
POST /upload
Content-Type: multipart/form-data

Response Format

  • Single file: Returns .md file directly
  • Multiple files: Returns .zip archive containing all converted Markdown files
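
A client call against this endpoint might look like the sketch below. It is written under assumptions rather than taken from the MkDowner source: the form field name `files`, the `baseUrl` parameter, and the function names are hypothetical, and the response type is inferred from the number of files sent, following the two rules above.

```typescript
// Hypothetical: the docs do not name the multipart field; check the backend.
const UPLOAD_FIELD = "files";

// One uploaded file yields a .md response; several yield a .zip archive.
function expectedResponseKind(fileCount: number): "md" | "zip" {
  if (fileCount < 1) {
    throw new Error("at least one file is required");
  }
  return fileCount === 1 ? "md" : "zip";
}

// Sends the files to POST /upload as multipart/form-data and returns the raw body.
async function uploadForConversion(
  baseUrl: string,
  files: File[]
): Promise<{ kind: "md" | "zip"; body: Blob }> {
  const kind = expectedResponseKind(files.length);
  const form = new FormData();
  for (let i = 0; i < files.length; i++) {
    form.append(UPLOAD_FIELD, files[i], files[i].name);
  }
  // No Content-Type header is set here: fetch adds the multipart boundary itself.
  const response = await fetch(`${baseUrl}/upload`, { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return { kind, body: await response.blob() };
}
```

The returned `kind` tells the caller whether to save the body as a Markdown file or as a zip archive.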

Format-Specific Notes

Text is extracted from PDFs. Image-based PDFs may require OCR capabilities on the backend for best results.
DOCX, PPTX, and XLSX files are parsed for structure and content. Formatting is converted to Markdown equivalents where possible.
PNG and JPG images can have text extracted via OCR. This feature requires additional backend configuration.
WAV and MP3 files can be transcribed to text. Transcription capabilities depend on backend setup.
HTML is converted to clean Markdown. JSON and XML are formatted as structured text.

AI-Enhanced Conversion

MkDowner uses Microsoft MarkItDown with AI-enhanced conversion for better formatting preservation and structure recognition.
This enables:
  • Intelligent table detection and conversion
  • Header hierarchy preservation
  • List structure recognition
  • Code block formatting
  • Link and reference handling
