pdf2wordx uses the pdf2docx library (version 0.5.8) to parse PDF structure and reproduce it as an editable Microsoft Word document. The pdf2docx library reads the PDF’s internal page data (text blocks, images, and layout geometry) and maps it to Word-compatible constructs stored in the .docx Open XML format. The conversion is initiated from within the Funcs class in files/functions.py and runs asynchronously so that the Tkinter interface stays responsive while a potentially large document is being processed.
Step-by-step conversion
Set your output filename
When the app launches, the filename entry field is pre-populated with document-pdf2wordx. You can keep this default or clear the field and type your preferred name. Do not include the .docx extension; it is appended automatically. Whatever text is in the entry field at the moment you click “Elegir Directorio” (Choose Directory) becomes the final filename. For example, typing my-report produces my-report.docx.
Click "Abrir Archivo" (Open File) to select a PDF
Click the yellow-green “Abrir Archivo” button to open the native OS file picker. The dialog is pre-filtered to show only .pdf files. After you select a valid file:
- The PDF’s base filename is extracted with os.path.basename() and stored in self.file_name_original.
- The “Elegir Directorio” button is enabled automatically.
- The “Archivo PDF:” info label at the bottom of the window updates to show the selected filename.
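The selection step above can be sketched roughly as follows. This is an illustration, not the app's actual code: the names `PDF_FILETYPES`, `pick_pdf`, and `describe_selection` are hypothetical, and the filter label is an assumption. Only `filedialog.askopenfilename` and `os.path.basename` are standard-library calls.

```python
import os

# Hypothetical filter: restrict the picker to .pdf files.
PDF_FILETYPES = [("PDF files", "*.pdf")]


def pick_pdf():
    """Open the native file picker and return the chosen path ('' if cancelled)."""
    # Imported lazily so the helper below can be used without a display.
    from tkinter import filedialog

    return filedialog.askopenfilename(filetypes=PDF_FILETYPES)


def describe_selection(path):
    """Return the base filename to show in the info label, or None if nothing was picked."""
    if not path:
        return None
    return os.path.basename(path)
```

In a Tkinter callback, the result of `pick_pdf()` would be passed to `describe_selection()` before enabling the next button; a cancelled dialog yields an empty value, which the helper treats as "no selection".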
Click "Elegir Directorio" (Choose Directory) to pick an output folder
This button becomes active only after a PDF has been selected. Clicking it triggers two sequential actions:
- The current text in the filename entry field is read and combined with the .docx extension to form the output filename.
- A native OS directory picker opens, titled “Busca la ruta de salida del archivo” (“Find the output path for the file”). The full output path is constructed by joining the chosen directory and the output filename.
Once a directory has been chosen:
- The “Convertir” button is enabled.
- The “Archivo De Salida:” info label updates to show the configured output filename.
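As a minimal sketch of how the output filename and full path described above could be assembled, assuming the entry text is used verbatim (the function name `build_output_path` is hypothetical, not taken from the project):

```python
import os


def build_output_path(entry_text, directory):
    """Append .docx to the entry text and join it with the chosen directory."""
    output_name = entry_text + ".docx"
    full_path = os.path.join(directory, output_name)
    return output_name, full_path
```

Because the extension is always appended, typing `my-report.docx` in the entry field would, under this sketch, produce `my-report.docx.docx`, which is why the extension should be left out.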
Click "Convertir" (Convert) to start the conversion
With both a source PDF and an output directory configured, click “Convertir” to begin. Two dialogs appear in sequence:
- Info dialog: shown immediately with the message "Convirtiendo <full path to pdf>" (“Converting <full path to pdf>”), confirming that the process has started.
- Success dialog: shown once the conversion finishes, with the message "Se ha convertido el archivo <filename> exitosamente" (“The file <filename> was converted successfully”).
Under the hood
The core conversion logic lives in the Funcs._convertFile() async method inside files/functions.py. When the Convert button is clicked, App.convertFile() in _pdf2wordx.py runs it with asyncio.run() inside a dedicated background thread.
Inside _convertFile(), the pdf2docx.Converter class is instantiated with the path of the selected PDF, convert() is called with the full output path, and then the converter is closed cleanly.
Using threading.Thread ensures the Tkinter main loop is never blocked, keeping the window responsive while pdf2docx processes the document.
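The pattern described above might look roughly like the sketch below. It is not the project's source: `convert_file`, `start_conversion`, and the `converter_factory` parameter are hypothetical names added for illustration and testing. The `Converter(pdf_path)`, `convert(docx_path)`, and `close()` calls follow pdf2docx's documented usage.

```python
import asyncio
import threading


async def convert_file(pdf_path, docx_path, converter_factory=None):
    """Open the PDF, write the .docx, and always close the converter."""
    if converter_factory is None:
        # Imported lazily so the sketch can be exercised without pdf2docx installed.
        from pdf2docx import Converter

        converter_factory = Converter
    converter = converter_factory(pdf_path)
    try:
        converter.convert(docx_path)
    finally:
        # Close even if convert() raises, so the PDF handle is released.
        converter.close()


def start_conversion(pdf_path, docx_path, converter_factory=None):
    """Run the coroutine with asyncio.run() on a background thread."""
    worker = threading.Thread(
        target=lambda: asyncio.run(
            convert_file(pdf_path, docx_path, converter_factory)
        ),
        daemon=True,
    )
    worker.start()
    return worker
```

A button callback would call `start_conversion(pdf_path, docx_path)` and return immediately, leaving the Tkinter main loop free while the worker thread does the conversion.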
Output file format
File type
The output is always a .docx file, the standard Open XML format used by Microsoft Word and compatible editors such as LibreOffice Writer and Google Docs.
Filename
The filename is derived from the entry field value at the time “Elegir Directorio” (Choose Directory) is clicked, with .docx appended automatically. For example, report → report.docx.
Save location
The file is written to the exact directory path selected via the directory picker. The full path (directory + filename) is visible in the “Directorio De Salida:” label after conversion.
File size
Output .docx files may be considerably larger than the source PDF, particularly when the PDF contains embedded images or complex vector graphics.
After conversion
Once a conversion completes successfully, the Funcs._disableButton() method disables both the “Elegir Directorio” (Choose Directory) and “Convertir” (Convert) buttons.
If an error occurs during conversion (for example, a corrupted PDF or an inaccessible output directory), a showerror dialog is displayed and the error is logged to ./src/pdf2wordx/log.log via the chromologger logger. The buttons are not disabled in this case, so you can fix the issue and try again without restarting.
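The success and failure paths described above can be summarized in a short sketch. It uses the standard `logging` module as a stand-in for chromologger, whose API is not shown here, and the names `convert_with_feedback`, `disable_buttons`, and `show_error` are hypothetical. In the real app, `show_error` would presumably be `tkinter.messagebox.showerror`.

```python
import logging

# Stand-in logger; the app itself logs through chromologger.
logger = logging.getLogger("pdf2wordx-sketch")


def convert_with_feedback(convert, disable_buttons, show_error):
    """Run convert(); disable buttons only on success, report errors otherwise."""
    try:
        convert()
    except Exception as exc:
        # Failure: log and notify, but leave the buttons enabled for a retry.
        logger.error("Conversion failed: %s", exc)
        show_error("Error", str(exc))
        return False
    # Success: the buttons are disabled only on this path.
    disable_buttons()
    return True
```

Keeping `disable_buttons()` outside the `try` block means a failed conversion never reaches it, which matches the retry behavior described above.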