Convert PDF file to text with pdftotext

September 9, 2010

Extracting the text from a PDF file is often useful, but doing it by hand can be next to impossible. Luckily, Linux has a command-line program called pdftotext, which is included with the xpdf package.

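The first step is to make sure that the xpdf package is installed. On Ubuntu you can use the following command. (On newer Ubuntu releases pdftotext is provided by the poppler-utils package, so install that one if the command is still missing afterwards.)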
$ sudo apt-get install xpdf
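
To confirm that the install worked, you can check whether pdftotext is on your PATH. The snippet below is only a minimal sketch; the helper name have_cmd is made up for this example and is not part of xpdf.

```shell
# have_cmd is a hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdftotext; then
    echo "pdftotext is installed"
else
    echo "pdftotext not found; install the package that provides it"
fi
```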

Now you can convert a PDF to text with pdftotext. This command writes a file named <filename>.txt alongside the PDF.
$ pdftotext <filename>.pdf
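
If you have a whole directory of PDFs, the same command can be wrapped in a loop. This is a rough sketch rather than a polished script: the function name convert_all_pdfs is an assumption for illustration, and it relies on pdftotext accepting an optional second argument that names the output text file.

```shell
# convert_all_pdfs is a hypothetical helper: converts every *.pdf in a
# directory (default: current directory) to a .txt file with the same base name.
convert_all_pdfs() {
    pdf_dir="${1:-.}"
    for pdf_file in "$pdf_dir"/*.pdf; do
        # Skip the literal pattern when the directory contains no PDFs.
        [ -e "$pdf_file" ] || continue
        # Second argument is the output file: foo.pdf -> foo.txt
        pdftotext "$pdf_file" "${pdf_file%.pdf}.txt"
    done
}
```

Calling convert_all_pdfs with a directory path converts everything in that directory; calling it with no argument works on the current directory.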

You can also attempt to preserve some of the formatting within the PDF such as columns and spacing by using the “-layout” option.
$ pdftotext -layout <filename>.pdf
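
Whether the layout option helps depends on the PDF, so it can be worth producing both versions and comparing them. The sketch below assumes a made-up helper name, convert_both_ways, and made-up output suffixes (.plain.txt and .layout.txt).

```shell
# convert_both_ways is a hypothetical helper: writes the default output and
# the -layout output of one PDF to two separate files for comparison.
convert_both_ways() {
    src_pdf="$1"
    out_base="${src_pdf%.pdf}"
    pdftotext "$src_pdf" "$out_base.plain.txt"
    pdftotext -layout "$src_pdf" "$out_base.layout.txt"
    echo "Wrote $out_base.plain.txt and $out_base.layout.txt"
}
```

You can then open the two files side by side, or run diff on them, to see which one keeps the columns and spacing you care about.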

Copyright 2008-2010 WiredRevolution.com. All rights reserved.