Subject: Re: [docbook-apps] Word 2007+ to DocBook
Hi Greg,

as an alternative to the roundtrip/.docx route, I have found a __very__ satisfactory MS Word to DocBook solution using the fantastic macro written by Michal Kebrt, which you can find at http://wordtolatex.sourceforge.net.

The idea is to transform a plain old Word .doc file to a custom intermediate "flat" XML, and then transform this to DocBook using a fairly simple XSLT (around 250 lines, not optimized). The intermediate XML can have a flat structure like this (__much__ simpler and more readable than the docx format):

    <para><hdg1>chapter title</hdg1></para>
    <para><style name="wordStyle1">a para</style></para>
    <para><style name="wordStyle2">a para</style></para>
    <para><style name="wordStyle3">a para</style></para>
    <para><hdg2>section title</hdg2></para>
    <para><style name="wordStyle3">a para</style></para>
    <image fileref="img43.png" width="168" format=""/>
    <para><style name="wordStyle3">a para</style></para>
    <table>....</table>
    ...
    ...

Note that the only tags at the first level are (para|image|table)*, which keeps the XSLT for transforming to DocBook fairly simple. You can then use sibling relationships to build the nested DocBook structure, based on the inner heading and style tags.

Obviously you first have to clean up the input Word file (apply/remove/rework styles as needed). Word2Latex has a nice configuration file (and GUI) to map the various standard Word structures to custom XML.

I have not tested this method with complicated nested lists and the like (and I do not claim that this method can convert __any__ Word document!), but it works surprisingly well with all the normal Word structures (footnotes, index entries, custom para styles, etc.). We use this method to routinely convert non-technical books exported from XPress/InDesign.

If you are interested, I can send you my DTD for validating the intermediate XML and an XSLT for going from the intermediate XML to DocBook.
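To give a rough idea of the sibling-based nesting described above, here is a minimal sketch in Python rather than XSLT. It is not the actual stylesheet, only an illustration under some assumptions: the flat elements are wrapped in a hypothetical <doc> root, hdg1 opens a chapter, hdg2 opens a section, the Word style name is copied to a role attribute, and tables are skipped. The function name flat_to_docbook is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the flat intermediate XML; the <doc> wrapper is an
# assumption, since only the first-level (para|image|table)* items are known.
FLAT = """<doc>
<para><hdg1>chapter title</hdg1></para>
<para><style name="wordStyle1">a para</style></para>
<para><hdg2>section title</hdg2></para>
<para><style name="wordStyle3">another para</style></para>
<image fileref="img43.png" width="168" format=""/>
</doc>"""


def flat_to_docbook(flat_root):
    """Walk the flat siblings in order and nest them under chapter/section."""
    book = ET.Element("book")
    chapter = None  # chapter currently being filled
    section = None  # section currently being filled, if any

    for node in flat_root:
        first = node[0] if len(node) else None

        # A heading paragraph opens a new container for the siblings after it.
        if node.tag == "para" and first is not None and first.tag == "hdg1":
            chapter = ET.SubElement(book, "chapter")
            ET.SubElement(chapter, "title").text = first.text
            section = None
            continue
        if node.tag == "para" and first is not None and first.tag == "hdg2":
            parent = chapter if chapter is not None else book
            section = ET.SubElement(parent, "section")
            ET.SubElement(section, "title").text = first.text
            continue

        # Everything else goes into the innermost open container.
        if section is not None:
            container = section
        elif chapter is not None:
            container = chapter
        else:
            container = book

        if node.tag == "para":
            para = ET.SubElement(container, "para")
            if first is not None and first.tag == "style":
                # Assumed mapping: Word style name becomes the role attribute.
                para.set("role", first.get("name", ""))
                para.text = first.text
            else:
                para.text = node.text
        elif node.tag == "image":
            media = ET.SubElement(container, "mediaobject")
            image_object = ET.SubElement(media, "imageobject")
            ET.SubElement(image_object, "imagedata",
                          fileref=node.get("fileref", ""))
        # <table> and anything else is ignored in this sketch.

    return book


if __name__ == "__main__":
    result = flat_to_docbook(ET.fromstring(FLAT))
    print(ET.tostring(result, encoding="unicode"))
```

The sketch makes no attempt to produce valid DocBook in every case (for example, content appearing before the first heading lands directly under book); a real stylesheet would have to handle that, along with tables, footnotes and index entries.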
This method works only under Windows (you need Microsoft OLE automation) and is probably not as general as the DocBook roundtrip XSLT, but in my view it has the advantage of flexibility and simplicity: with a single configuration you can directly map custom Word styles to DocBook role attributes on paras (and you pay for this flexibility with the need for a pipeline of 2 XSLTs).

I have made some tests with OpenOffice "Save as DocBook", but the results were less than satisfactory, at least for me (the OpenOffice DocBook XSLT seems quite old and apparently is not actively maintained).

Regards,
__peppo

On Thu, May 6, 2010 at 10:29 PM, <gpevaco@aol.com> wrote:
> Howdy DocBook Community:
>
> I am new to DocBook, and also new to this forum. I have been going through
> the archives, and found some very interesting discussions. Primarily I am
> interested in moving/converting some documents from Word which they were
> authored in to DocBook.