XML Workflow for Publishers, Part - 101, 2009
By Dr. Brijesh Kumar, Digital Media Initiatives

Text processing is a critical activity for any publisher, because it helps automate parts of the document creation and publishing process. The first wave of automated text processing was computer typesetting. The same file contained both the content and the rendition details. This rendition would then be converted to a presentation (typically with paper as the medium).

Typesetting soon progressed to Desktop Publishing, and programs such as MS Word and Adobe PageMaker evolved. These programs still worked with renditions, but gave users a nicer interface for manipulating them. The user interface to the rendition looked like the presentation, and hence this was called WYSIWYG (What You See Is What You Get) publishing.

Formatting Markup (marking up with renditions) gradually evolved into Generalized Markup of documents, because documents could be put to more uses than merely rendering into presentations, such as single-source multiple-channel publishing and searchable documentation databases.

Single-Source Multiple-Channel Publishing is a form of publishing in which content, once created, can be rendered in multiple formats and published through multiple channels, either on paper or digitally (the Web and eBooks).

Searchable Documentation Databases that are cross-indexed and hyperlinked are a reality today, built on content management systems and the reuse of chunks of information as and when required.

There are two key XML vocabularies currently in vogue for authoring technical documentation:

1. DITA: Darwin Information Typing Architecture
2. DocBook

Interest groups are debating the case for and against using DITA or DocBook for publishing any type of document or book.
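Before weighing that debate, the single-source, multiple-channel idea described earlier can be made concrete with a minimal sketch. It assumes a tiny hand-written fragment that borrows a few DocBook-style element names (article, title, para, emphasis); it is not a validated DocBook document, and the functions to_html and to_text are hypothetical helpers written for this illustration, not part of any DocBook toolchain. The point is only that one presentation-neutral source can feed more than one rendition.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample content using a few DocBook-style element names.
SOURCE = """<article>
  <title>XML Workflow</title>
  <para>Content is written <emphasis>once</emphasis>.</para>
  <para>Renditions are generated per channel.</para>
</article>"""


def inline_html(elem):
    """Render an element's mixed content, mapping <emphasis> to <em>."""
    parts = [escape(elem.text or "")]
    for child in elem:
        if child.tag == "emphasis":
            parts.append("<em>" + inline_html(child) + "</em>")
        else:
            parts.append(inline_html(child))
        # Text that follows the child element belongs to the parent.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def to_html(root):
    """Web channel: the title becomes <h1>, each para becomes <p>."""
    lines = ["<h1>" + escape(root.findtext("title", "")) + "</h1>"]
    for para in root.findall("para"):
        lines.append("<p>" + inline_html(para) + "</p>")
    return "\n".join(lines)


def to_text(root):
    """Plain-text channel: underline the title, drop inline markup."""
    title = root.findtext("title", "")
    lines = [title, "=" * len(title)]
    for para in root.findall("para"):
        lines.append("".join(para.itertext()))
    return "\n".join(lines)


# One source, two renditions; the content itself is never edited.
root = ET.fromstring(SOURCE)
html_out = to_html(root)
text_out = to_text(root)
print(html_out)
print(text_out)
```

In a real workflow the renditions would more likely come from established stylesheets than from hand-written functions like these; the sketch only shows why keeping rendition details out of the source makes the second channel cheap to add.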
Some experts argue that DITA can point to four technical differences from DocBook that arguably count in its favor [1], and that DocBook can perhaps make up for them. Others argue that DITA is more powerful than DocBook, which may eventually fade [2]. I suggest that this debate be taken up in another post; here we shall discuss DocBook XML as a potential vocabulary for a publishing workflow.

The first question is: why DocBook XML for a publishing workflow? DocBook XML is a semantic markup language. It was originally created for technical documentation, but it is flexible enough to be used for any kind of published documentation. Its set of markup tags lets a user create content in a presentation-neutral form, which can then be rendered in other formats, such as HTML, PDF, or yet another XML vocabulary, without any changes to the content.

DocBook XML qualifies on both fronts: its use in single-source multiple-channel publishing and in searchable documentation databases. It doesn't matter whether everyone writes in DocBook; as long as it becomes the common document interchange format that everyone uses, we will still get unified searchable documentation databases. Authors may write in multiple formats (Notepad text files, Word doc files, or simple HTML files); those formats can easily be converted into DocBook XML, which in turn can be transformed into various formats.

Publishers need to explore whether Quark or InDesign shops can receive DocBook XML as a source file. When this is possible, new avenues to digital publishing open up for a publisher. We shall continue this discussion in my next posts.

---------
1. http://norman.walsh.name/2005/10/21/dita
2. http://times.usefulinc.com/2006/01/23-dita