Sunday, April 25, 2010

What is a web-based document?

Apple’s iPad has certainly stirred up discussions about ebooks again. It seems every time a new ebook reader is introduced (like the Kindle), the talk about ebooks hots up – either ‘the book is dead’ or ‘you can’t curl up in bed with a laptop’, depending on which side of the digital divide you’re arguing for. And with that talk, people question the format of ebooks. The latest claim is that epub will be ‘the’ format for ebooks. Have a look at http://www.abc.net.au/rn/breakfast/stories/2010/2815124.htm for instance.

Being a novice to the whole ebooks idea – I don’t have a reader or even an iPhone – I wonder where the epub format will leave PDF. I think PDF is great. I have downloaded a few PDF ebooks because I read them on my laptop at home (where I have the time to read) – the wide colour screen suits the page size and makes for easy reading. Others think PDF is great too. I have prepared a few reports as PDFs for clients, who then put those publications on their websites as downloadable documents. PDF is one of the main formats for web documents right now. Readers can download a PDF and read it onscreen or print out the relevant pages – it’s such a convenient format. I haven’t come across anyone who has asked for an HTML version of their document. Probably because it means creating an entirely new document and generating graphics, text and stylesheets to display the same content as the PDF (and who wants to pay twice for two different formats of the same document?).

On to the other format – I have tried to read HTML documents and given up in frustration. Well, okay, I once took a graduate course where all the course notes were provided as hyperlinked pages, and I ended up copying and pasting each page into Word. Not my only experience of trying to download web pages, mind you. HTML documents are designed to be read while you are online, and you might not be able to easily save an entire publication unless the creator has been kind enough to provide a downloadable zip file of all the pages. If this is the best that HTML has to offer in the world of reading – despite the attraction of being able to handle rich media content – then I can see it dying a natural death as an ebook format. The very structure of HTML pages is a turn-off because I can’t download an HTML document and read it at my leisure. Offline. When I’m not paying big dollars for my broadband access.

I really look forward to seeing epub in action, especially if I can download content and then view it offline. I may even think about getting an iPhone…

2 comments:

  1. Hi Dave,

    The big difference between PDF and HTML/ePub is reflowability. That just means the ability to adjust the text displayed to the size of the device or window you're reading on. If you're reading a PDF, you can enlarge or reduce the size of the whole page, and you can change the size of the window you're reading in, but you can't reflow the text, that is, make the lines of text shorter or longer, with the lines breaking wherever they need to and the document repaginating in order to fit the display. This is fine on a good-sized computer monitor where you can display the whole page (or at least the full width of the page) and still read the type, but it's not good on an ebook reader. I have put a few PDFs on my Sony ebook Reader, and I am forced to display full pages, which on my Sony Reader are about 3 inches by 5 inches. I can't read the type because it's so small.

    On the other hand, if you're reading an HTML document in an application in a window on your computer, you can change the size of the window and the text will reflow to fit the window. It doesn't change the size of the type to fit the window unless you want it to. You can also leave the window size unchanged and enlarge the type size if the default is too small. This is also what you do when reading ePubs (which are really HTML internally) on small devices, such as ebook readers and phones. Most of these devices let you change the size of the type, and the line length changes and the book repaginates. You can view the same ePub file on devices with different displays, and the text will flow to fit the device. You can also download the ePub, disconnect from the Internet, and read it offline.

    In short, PDFs are great for printing and for viewing full size on a computer screen, but on smaller devices, the flexibility of HTML/ePub has a lot of advantages.

    I'll send you an example of a document formatted as a PDF and as an ePub so you can see the difference. There are lots of applications you can use to read ePubs on computers, Adobe's Digital Editions being one well-known example.

    Cheers
    John

  2. Great, thanks John.

    My main bugbear about HTML-based documents was not being able to easily download all the web pages. I know there is software that will find all the pages on a website and download them to your PC in a structured manner - I just haven't tried it.

    So, when I said HTML will die a natural death, I really meant HTML is a good format for displaying content, but not so good to download. (Why didn't I just say that? Don't ask.)

    I won't spend time on the net reading - I only use it to browse for things I want to read. So now you've got me interested when you say epub books can be downloaded and read offline.
