DESCRIPTION:
Ruby wrapper for the wv2 library, which parses Microsoft Word files. So far it fires callbacks to a TextHandler, SubDocumentHandler, TableHandler and InlineReplacementHandler, provided they are registered with the Parser.
TODO:
Rwv2 does not yet support the full set of Word file properties. Notably missing are:
Font Family Name (FFN)
Tab Descriptor (TabDescriptor)
Word-internal Date and Time (DTTM)
Shading Descriptor (SHD)
Paragraph Height (PHE)
Border Code (BRC)
Table Autoformat
Autonumbering
and many more. I’m taking the YAGNI (you aren’t gonna need it) approach to most of these; if you actually need one of them, or any other feature, let me know.
wvWare writes errors, warnings and infos directly to std::cerr. This can possibly be caught by replacing cerr's buffer. The tricky part is then to raise, warn or ignore according to the buffer's content, probably within a separate thread.
Some of the testing is unclean, as I’ve only tested with Word files exported from OpenOffice.
Documentation
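Until the std::cerr item above is solved inside the extension, one possible workaround from the Ruby side is to redirect the process's standard error descriptor while parsing. The sketch below is not part of rwv2: capture_stderr is a hypothetical helper, and it assumes that wvWare's messages end up on file descriptor 2. It swaps the descriptor rather than cerr's buffer, so it is a different technique from the one named in the TODO item.

```ruby
require 'tempfile'

# Hypothetical helper (not part of rwv2): runs the given block with
# standard error (file descriptor 2) redirected to a temporary file and
# returns everything that was written to it as a String.
def capture_stderr
  saved = STDERR.dup                 # duplicate of the original stderr, used to restore it
  tmp = Tempfile.new('rwv2-stderr')
  begin
    STDERR.flush
    STDERR.reopen(tmp.path, 'w')     # fd 2 now points at the temporary file
    yield
  ensure
    STDERR.flush
    STDERR.reopen(saved)             # restore the original stderr
    saved.close
  end
  File.read(tmp.path)
ensure
  tmp.close! if tmp                  # close and delete the temporary file
end

# Possible use with a parser created as in USAGE:
#   messages = capture_stderr { parser.parse }
#   warn messages unless messages.empty?
```

Deciding whether to raise, warn or ignore based on the captured text is left open here, just as in the TODO item.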
USAGE:
require 'rwv2'
require 'rwv2/handlers'
class TextHandler < Rwv2::TextHandler
  def run_of_text(text, character_properties)
    puts text
  end
end
parser = Rwv2.create_parser('test/data/test2.doc')
parser.set_text_handler(TextHandler.new)
parser.parse
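If the text is needed as a value rather than printed, the same callback can accumulate it. This is a minimal sketch: CollectingTextHandler is a hypothetical name, the stand-in base class exists only so the example loads without rwv2 installed, and it assumes run_of_text receives each run as a String.

```ruby
begin
  require 'rwv2'
  require 'rwv2/handlers'
rescue LoadError
  # Stand-in base class so this sketch loads without rwv2/libwv2.
  # It is NOT part of rwv2.
  module Rwv2
    class TextHandler; end
  end
end

# Hypothetical handler: collects the text runs instead of printing them.
class CollectingTextHandler < Rwv2::TextHandler
  # Text collected so far (created on first use).
  def text
    @text ||= String.new
  end

  # Same callback signature as in the USAGE example above.
  def run_of_text(run, character_properties)
    text << run
  end
end

# Possible use, following the USAGE example:
#   handler = CollectingTextHandler.new
#   parser = Rwv2.create_parser('test/data/test2.doc')
#   parser.set_text_handler(handler)
#   parser.parse
#   puts handler.text
```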
REQUIREMENTS:
libwv2
Tested with Ruby 1.8; let us know about other versions!
INSTALL:
sudo gem install rwv2
DEVELOPERS:
Masaomi Hatakeyama
Zeno R.R. Davatz
Hannes Wyss (up to Version 0.6.0)
LICENSE:
GPLv2.1
OTHER:
wvWare was written by Caolán McNamara and is currently (22.8.2003) maintained by Dom Lachowicz.