------------------
Apache Lucene Tika
"Toolkit for detecting and extracting metadata and structured text
content from various documents using existing parser libraries."
For more information:
http://lucene.apache.org/tika/