How is rendering a Word document different from rendering a website?

Question

Now, it doesn't necessarily have to be Word — for ease of comparison, let's use ODT, which is based on XML — which is pretty similar to HTML. That would, to my mind, make rendering an ODT document almost like rendering an HTML website.

With ODT and HTML+CSS basically being two ways of describing a page's layout, what are the differences in rendering them?
Is it simply that HTML+CSS is more flexible and thus requires more complex rendering? A complicated website can have countless nested elements, all with relative positioning, custom styling etc. Compared to that, an ODT has a far simpler/more predictable structure, which I think should be easier to render.

score 4 · Answer 1 · answered Jan 14 '16 at 14:59

When you're talking about rendering engines, they are very different. For one thing, HTML documents link to external resources and are meant to provide a way to navigate between pages. That's what "Hypertext" means. Word documents are meant to represent the markup of a printed page. They are almost a typesetting tool.

HTML has to work and relay the information regardless of output device (screen, printer, screen printer, TTS, or others). A Word document's output is either an emulated 8.5 x 11 page or a real one (or other sizes).

The jobs of HTML and Word documents are fundamentally different. It's basically like trying to compare cars and boats. There are similarities, but there are far more differences.

bakoyaro · Answer 2 · 2016-01-15T15:22:26.313

Let's cut to the chase: we are talking about markup languages and how they are displayed in a browser.

HTML is data coupled with the instructions on how to display the data. Other technologies, such as CSS and JavaScript, can be used to change the document after it is rendered in a browser.

XML is primarily data, generally without instructions on how to display it. XSLT and similar tools can be used in conjunction with the XML to display that data in a chosen format.

ODT is XML, but, identified by its file extension and its properties, it can be transformed from text and binary resources into a graphical display, much like an HTML document is rendered in a browser.

As with anything in CS, there will be exceptions, such as an API or some other tool that can make changes that were not envisioned by the authors of the specifications.

Browsers are designed to take HTML (text), based on the extension type, and turn that into a graphical display of those text and binary resources.

Most browsers are designed to take XML, also based on the extension type, and display a hierarchical tree of the data. That is where transformations such as XSLT come in: they are designed to take data in a specific format and transform it into something else, such as HTML, text, or more XML.
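To make the idea of a transformation concrete, here is a minimal Python sketch. It is not XSLT itself: the function below is a hand-written stand-in for what a stylesheet would do, and the `people`/`person` vocabulary and the `people_to_html` name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data-only XML: nothing in it says how it should look on screen.
XML_DATA = """<people>
  <person><name>Ada</name><role>Engineer</role></person>
  <person><name>Linus</name><role>Maintainer</role></person>
</people>"""


def people_to_html(xml_text):
    """Turn the data-only XML into an HTML list (a stand-in for an XSLT)."""
    root = ET.fromstring(xml_text)
    items = []
    for person in root.findall("person"):
        # Escape the text so the data cannot inject markup into the output.
        name = escape(person.findtext("name", default=""))
        role = escape(person.findtext("role", default=""))
        items.append(f"<li><b>{name}</b>: {role}</li>")
    return "<ul>" + "".join(items) + "</ul>"


html_output = people_to_html(XML_DATA)
print(html_output)
```

The presentation decisions (a bulleted list, bold names) live entirely in the transformation, not in the XML, which is the separation described above.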

XML is primarily concerned with storing the data. By design, there aren't any instructions embedded in XML that define how the data should be displayed. Custom XML schemas sometimes throw this idea right out the window and mix XML with custom elements and attributes in order to create their own markup language variant for their own custom interpreter. Such is the case with ODT and other open document types.

Since ODT is also XML, a browser could, based on the extension, process the data using a specific set of instructions.
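As a rough illustration of what such an interpreter starts from: an ODT file is a ZIP archive whose main text lives in a `content.xml` entry, using ODT's own namespaced elements such as `text:p`. The Python sketch below builds a tiny in-memory stand-in (not a complete, valid ODT; a real file also carries a manifest, styles and more) and pulls the paragraphs back out. The helper names are hypothetical, and nothing here lays out a page, which is the hard part of rendering.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by OpenDocument content files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# A stripped-down content.xml, for illustration only.
CONTENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h text:outline-level="1">Title</text:h>
      <text:p>First paragraph.</text:p>
      <text:p>Second paragraph.</text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def build_sample_odt():
    """Pack the sample content.xml into an in-memory ZIP (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", CONTENT_XML)
    return buf.getvalue()


def odt_paragraphs(odt_bytes):
    """Return the text of every text:p element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


paragraphs = odt_paragraphs(build_sample_odt())
print(paragraphs)
```

The parser only recovers the structure. Deciding that `text:p` means "a paragraph flowed onto a page of a given size" is the job of software that knows the ODT vocabulary.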

Check out these links for more information on the HTML and XML specifications:

Link to HTML 5 Specification at W3C

Link to the XML Specification at W3C

Jon Raynor · Answer 3 · 2016-01-14T20:24:01.600

HTML documents have pre-defined tags, whilst XML does not. Because the tags are defined, browsers can be made to render the display.

A <BODY> tag has a specific meaning in HTML and is treated as such.

Now consider this XML fragment:

<BODY>
 <ARM></ARM>
 <EYECOLOR></EYECOLOR>
</BODY>

The <BODY> tag has a different context and thus is treated differently than an HTML tag.
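A small Python sketch of the same point, using the fragment above: a generic XML parser accepts `<BODY>` happily, but it only sees a named node with children and attaches no display meaning to it. The `KNOWN_HTML_TAGS` set is a deliberately tiny, hypothetical stand-in for the vocabulary a browser knows how to render.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<BODY>
 <ARM></ARM>
 <EYECOLOR></EYECOLOR>
</BODY>"""

# Hypothetical, heavily abbreviated list of tags an HTML renderer understands.
KNOWN_HTML_TAGS = {"html", "head", "body", "p", "div", "span"}

root = ET.fromstring(FRAGMENT)
child_tags = [child.tag for child in root]

# Children whose names mean nothing to the (hypothetical) HTML vocabulary.
unknown_to_html = [tag for tag in child_tags if tag.lower() not in KNOWN_HTML_TAGS]

print(root.tag, child_tags, unknown_to_html)
```

To the XML parser, `BODY` here is just the root of a data tree describing a person. Only a consumer with a predefined vocabulary can decide what, if anything, to draw for each element.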

As indicated by @MichealIT, ODT also has a specifically defined format for its tags, so any software or browser will need to adhere to that definition in order to render the document. There are probably many differences between an HTML document and an ODT one.
