python-docx is a Python library for creating and modifying Microsoft Word (.docx) documents. It supports text, images, shapes, tables and many other document features, and it can both create new documents and modify existing ones. To get started, install python-docx with pip or from source.
# using pip
pip install python-docx
# using easy_install
easy_install python-docx
# or build from source
tar xvzf python-docx-.tar.gz
cd python-docx-
python setup.py install
Now we can work with the basics of python-docx for Word document creation.
# import the Document class
from docx import Document
# initialize a new, empty document
document = Document()
An existing document can be opened the same way, by passing its path to Document().
document = Document(doc_path)
Next, we use python-docx functions to add content to the document.
python-docx offers several options for simple text, such as paragraphs and headings. A heading is a paragraph whose text size and style depend on the level given when it is created. Levels range from 0 to 9; level 0 is the document title and uses the largest font. Here are some examples of headings.
# title heading (level 0)
document.add_heading("This is the title (level 0) heading", 0)
# add other heading levels
document.add_heading("This is a level 2 heading", 2)
document.add_heading("This is a level 3 heading", 3)
document.add_heading("This is a level 5 heading", 5)
document.add_heading("This is a level 7 heading", 7)
document.add_heading("This is a level 9 heading", 9)
A paragraph is the basic block of text in a document, and Word wraps its content into lines to fit the page. Paragraphs have style and alignment options that control where and how their text appears.
paragraph = document.add_paragraph("TensorFlow is a free and open-source software library for machine learning and artificial intelligence.")
An existing paragraph can be extended with new text, and each added piece of text (a run) can carry its own formatting.
# add more text
paragraph.add_run(" It can be used across a range of tasks for ")
# add text with styles
paragraph.add_run('training model ').bold = True  # added text with bold
paragraph.add_run('and inference.').italic = True  # added italic text
Paragraphs can also use built-in styles, such as a quote style. The first argument is the paragraph text, and style is the name of the style to apply.
document.add_paragraph('I have no special talent', style='Intense Quote')
Paragraph formatting such as horizontal alignment, indentation and line spacing can also be applied. First, let's work with horizontal alignment.
from docx.enum.text import WD_ALIGN_PARAGRAPH
# check the previous alignment
print("Previous alignment", paragraph.paragraph_format.alignment)
# center the paragraph
paragraph.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER
Indentation is the horizontal space between a paragraph and the edges of its container. In python-docx, indentation can be specified in inches, so we import Inches and set the indent for the first line, the left side or the right side.
from docx.shared import Inches
paragraph_r = document.add_paragraph('This is some random paragraph for testing indentation on both (left and right) side of paragraph.')
# only first line indent
paragraph.paragraph_format.first_line_indent = Inches(0.5)
# paragraph indent
paragraph_r.paragraph_format.left_indent = Inches(0.5)  # apply 0.5 inch left indentation
paragraph_r.paragraph_format.right_indent = Inches(1)  # apply 1 inch right indentation
Line spacing is also part of paragraph formatting and is easy to set.
from docx.shared import Pt
paragraph.paragraph_format.line_spacing = Pt(18)
Font properties such as color, size and typeface can also be changed; they are set on a run rather than on the paragraph. Here we create a paragraph and set the font size and font name of its run.
# font modifications
para_font = document.add_paragraph()
run = para_font.add_run('This is a paragraph with different font styles.\n')
run.font.size = Pt(14)  # font size
run.font.name = 'Courier New'  # font name
We can also apply colors and other attributes such as bold, italic and underline.
from docx.shared import RGBColor
# add underlined text with blue color
url = para_font.add_run("http://google.com")
url.font.color.rgb = RGBColor(0x00, 0x00, 0xFF)
url.font.underline = True  # font underline
That completes the code for paragraphs. Now we can export the document by saving it to disk.
# save document
document.save("paragraphs.docx")