Subject: Re: DOCBOOK-APPS: Problem converting DB to PDF...
Dan,

On Thu, Feb 08, 2001 at 05:28:38PM -0500, Dan York wrote:
> However, because we also would like to have a version available that can
> be easily printed, I have been trying to generate a PDF or PostScript
> file. So far, I have been unsuccessful. The two major problems are:
>
> 1. Graphics do not appear in the PDF file. They are implemented as
>    <mediaobject> in the DocBook file.

I've been meaning to write this for a while. This is my guide to
including images in DocBook as painlessly as possible.

Assuming that you want to create, as a minimum, HTML, PS, and PDF
documents with the best quality images, you need to do the following.

First of all, you need to choose your preferred image format(s). This
is not as simple as picking a single format. Bitmap and vector images
are different enough that no single format suits both. Instead, you
need to pick a format that's good for bitmap images, and a format
that's good for vector images. The rest of this document assumes PNG
for bitmaps, and EPS for vector.

There's another wrinkle. In my experience, PDF generation works best
if you pass pdftex the name of a .pdf file, and not a .eps file.
However, EPS files can still be the source from which the PDF is
generated.

If you look at DocBook's image inclusion support, you'll see the
<mediaobject> element, which can contain one or more <imageobject>s.
The original idea was that, for each image, you would have one
<mediaobject>, which would contain several different <imageobject>s,
each pointing to a file in a different format. The stylesheets would
then select which <imageobject> to use. This is the approach taken in
Norm Walsh's stylesheets.

However, in my experience, this doesn't work quite right, and I had
difficulty getting the stylesheets to always select the correct
<imageobject> element to use.
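For reference, the multiple-<imageobject> style described above looks
something like the following. This is only a sketch: the filenames and
the format values are made up for illustration, and the exact set of
<imageobject>s you would list depends on which output formats you
target.

```sgml
<mediaobject>
  <imageobject>
    <imagedata fileref="image.eps" format="EPS">  <!-- hypothetical file, for print -->
  </imageobject>
  <imageobject>
    <imagedata fileref="image.png" format="PNG">  <!-- hypothetical file, for HTML -->
  </imageobject>
  <textobject>
    <phrase>An image</phrase>
  </textobject>
</mediaobject>
```

Every image needs one <imageobject> per format, and you are relying on
the stylesheets to pick the right one for each backend.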
A better approach has been to never include the filename's extension in
the <imageobject> element's attributes, and let the stylesheets add the
extension, or not, as necessary. A useful side effect of this is that
you only ever write one <imageobject> per <mediaobject>.

So, some sample markup might look like this:

  <mediaobject>
    <imageobject>
      <imagedata fileref="image">  <!-- Filename, without extension -->
    </imageobject>

    <textobject>
      <phrase>An image</phrase>
    </textobject>
  </mediaobject>

Of course, this assumes that you have image.{png,eps,pdf} in the
current directory as well.

If you want to convert a document containing this to HTML, you need to
use a stylesheet that customises Norm's sheets, and has

  (define %graphic-default-extension% "png")

in it. HTML is easy in this respect.

PS and PDF are a little more complicated. Again, you need to use a
customisation of Norm's stylesheets, this time containing the following
two functions (these work at least up to v1.61 of Norm's sheets).
These are re-writes of Norm's functions.

-------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --------
; Norm's sheets try and work out which one of the <imageobject>s
; should be used.  However, we only ever have one, so just use
; the first one.
;
; XXX This can probably be made more efficient by dropping the let*
; clause.  One day, I'll get around to testing that.
(define (find-displayable-object objlist notlist extlist)
  (let loop ((nl objlist))
    (if (node-list-empty? nl)
        (empty-node-list)
        (let* ((objdata   (node-list-filter-by-gi
                           (children (node-list-first nl))
                           (list (normalize "videodata")
                                 (normalize "audiodata")
                                 (normalize "imagedata"))))
               (filename  (data-filename objdata))
               (extension (file-extension filename))
               (notation  (attribute-string (normalize "format") objdata)))
          (node-list-first nl)))))

; This function, given a graphic filename, looks at the filename's
; extension, and appends %graphic-default-extension% as necessary.
;
; However, given a bare filename (such as "image") TeX is perfectly
; capable of adding the .eps or the .pdf as necessary.  Rather than
; try and second guess TeX, don't do anything if the tex-backend
; variable is set.
(define (graphic-file filename)
  (let ((ext (file-extension filename)))
    (if (or tex-backend                 ;; TeX can work this out itself
            (not filename)
            (not %graphic-default-extension%)
            (member ext %graphic-extensions%))
        filename
        (string-append filename "." %graphic-default-extension%))))
-------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --------

You can see this in the FreeBSD customisation layer, at

  http://www.freebsd.org/cgi/cvsweb.cgi/doc/share/sgml/freebsd.dsl

We keep both HTML and Print customisation layers in one file. You can
do the same thing, or use two different files if you want.

OK, so suppose you have, in one directory, the following files:

  doc.sgml     Your document
  html.dsl     Your customisation layer for HTML docs, which sets
               %graphic-default-extension%
  print.dsl    Your customisation layer for print docs, which contains
               the two functions listed earlier.
  image.png    PNG image
  image.eps    EPS image
  image.pdf    PDF image

What are the command lines you need to use?

As I said, HTML is easy.

  jade -c /your/path/to/the/catalog/files \
       -d html.dsl \
       -t sgml \
       doc.sgml

Add things like "-Vnochunks" or whatever, depending on your preference.
This should have used the PNG images.

PS is also relatively simple.

  jade -c /your/path/to/the/catalog/files \
       -d print.dsl \
       -Vtex-backend \
       -t tex \
       -o doc.tex \
       doc.sgml

Notice that you have to give the "-Vtex-backend" option. I've also
shown the use of -o to explicitly set the output file name.

You can then run

  tex "&jadetex" doc.tex

a few times, to generate the .dvi file, and then convert the DVI file
to PS.

PDF is a little more complex. As shipped, when producing PDF files,
teTeX will prefer to include a .png file over a .pdf file. I don't
know why that is.
The way to work around this is to make sure that the line

  \catcode`@=11\def\Gin@extensions{.pdf,.png,.jpg,.mps,.tif}\catcode`@=12

appears at the start of the .tex file, before you process it with
pdftex. There are many ways in which you can do this.

Anyway, your command line for generating PDF should look like this:

  jade -c /your/path/to/the/catalog/files \
       -d print.dsl \
       -Vtex-backend \
       -t tex \
       -o doc.tex \
       doc.sgml

As you can see, this is the same command line as for generating PS
output.

Once you've run this to generate doc.tex, edit doc.tex, and insert the
earlier "\catcode..." line at the start of the file. Then you can run

  pdftex "&pdfjadetex" doc.tex

a few times, to generate the .pdf file.

That's that, pretty much. You can see BSD style .mk files that
implement all of this, at

  http://www.freebsd.org/cgi/cvsweb.cgi/doc/share/mk/

and pay particular attention to doc.docbook.mk.

I'm not aware of any Linux distributions (with scripts like dbtopdf)
that make this level of customisation possible. Of course, it would be
trivial for you to implement your own replacement scripts which do
this. The FreeBSD make(1) approach is, of course, there for the
taking, and I'm happy to discuss it further on either this list, or
doc@freebsd.org.

<columbo>Oh, and one more thing.</columbo>

As you might be aware, you can use w3m (a text mode browser, with
support for tables) to provide very good DocBook -> Text conversion, by
first going DocBook -> HTML (as one big file), and then using w3m to
convert the HTML to plain text.

Wouldn't it be neat if you could include ASCII art in your document as
well, such that when you were going to produce plain text, instead of
getting the ALT text on the image, you got ASCII art instead?

Well, you can.
First, suppose that your markup looks like this

  <mediaobject>
    <imageobject>
      <imagedata fileref="image">  <!-- Filename, without extension -->
    </imageobject>

    <textobject>
      <para>+-----+
  |  A  |
  +-----+</para>
    </textobject>

    <textobject>
      <phrase>An image</phrase>
    </textobject>
  </mediaobject>

(assume, for the moment, that your image is of a box with the letter A
in it).

The HTML stylesheets check that the file
image.%graphic-default-extension% exists. If the image doesn't exist
then the stylesheets will use the contents of the first <textobject>
instead.

So if you run something like

  jade -c /your/path/to/the/catalog/files \
       -d /path/to/nwalsh's/html/docbook.dsl \
       -t sgml \
       -Vnochunks \
       doc.sgml > doc.html

  w3m -T text/html -S -dump doc.html > doc.txt

then you will have a doc.txt that contains your ASCII art instead.

This works because Norm's sheets, by default, do not define a value for
%graphic-default-extension%. In case he ever changes this, you might
want to create another HTML stylesheet which explicitly sets the
variable to #f.

For an example of this in action, take a look at

  http://www.freebsd.org/cgi/cvsweb.cgi/doc/en_US.ISO_8859-1/article/vm-design

and examine the files in there.

> 2. More importantly, outside of the missing graphics, the PDF file
>    looks fine for the first 19 pages, until it gets to Chapter 4. In
>    this chapter, I really just have the following construction:
>    <sect1>
>      <title></title>
>      <table>
>      ....
>      </table>
>      <table>
>      ....
>      </table>
>    </sect1>

I've downloaded your document, created a bunch of test images (I'm on a
56K dialup at the moment), and can't replicate this on FreeBSD, using
Jade 1.2.1, JadeTeX 2.2, and teTeX 1.0.7.

N
-- 
Internet connection, $19.95 a month. Computer, $799.95. Modem, $149.95.
Telephone line, $24.95 a month. Software, free. USENET transmission,
hundreds if not thousands of dollars. Thinking before posting, priceless.
Somethings in life you can't buy. For everything else, there's MasterCard.
-- Graham Reed, in the Scary Devil Monastery