Simplicity is the ultimate sophistication

Page Rendering Techniques on iPad Magazines

| Comments

Historical overview

One of the great improvements in all iPad owners lifestyle is the possibility to bring everywhere any sort of magazine or book, thanks to the screen size and the device light weight which both facilitate reading and carrying. In particular reports demonstrated that in a printed publications decreasing market there is a huge increase in the number of subscriptions to the digital versions of the same product (the interested reader can read this report from MPA).

Apple is following this trend with great interest, and this is quite clear if we take a look at the evolution of the iOS features that have been introduced since the release of the version dedicated to iPad, that is 3.2. In the particular the milestones that have been reached are three, shared between three major releases of the operating system:

  • iOS 3.2 was enriched by the CoreText framework, a technology dedicated to rendering text on display available since long time on Mac OSX and never ported in the earlier versions of the iPhone OS.
  • iOS 4.x introduced the concept of autorenewable subscriptions, as an addition to standard non consumable In App Purchases; this feature has been introduced after long discussions between Apple, that applies the 30% commission on every In App sale and forbids any other external cheaper store access within its devices, and the publishers looking for customer fidelity techniques.
  • finally iOS 5.0 added the Newsstand feature, which provides a central place to collect all magazine and newspaper apps and at the same time provide a nighttime content push to all subscribers, letting them to immediately read the latest issues of their publications and saving them for the extra time (sometimes long) required for the download.
  • as far as iOS 6, my developer agreement with Apple forbids me, while the new OS version is still under beta, to disclose any news; this paragraph will be updated this autumn…

What Apple didn’t provide instead is a common and unique developer platform dedicated to the creation of apps dedicated to the magazine consumption. This lead to a lot of initiatives dedicated to help publishers to enter in the iPad market with their own magazines. These initiatives were taken by major and well known companies, such Adobe with its Digital Publishing business, and a lot of many startups, everyone with its own solution.

As I said, Apple doesn’t provide a unique solution, but developers have the availability of a set of frameworks and techniques, with different levels of complexity, that provide different way of representing the page on the screen. There is not an optimal choice, as the final decision needs to take care of aspects that go beyond pure technical considerations.
In this article we will try to depict these solutions mainly from the app developer point of view, but will never forget to enumerate the pro and cons that can affect the publisher decision on which technology to adopt.

Page rendering overview

We assume that you, the developer, are in a certain point of your app development where the magazine has been purchased, downloaded and it’s ready to be read. Your document data at this point is safely stored in the device file system and it can be represented by a single pdf file, or a collection of html and css files or a directory containing assets of different formats, such as images, videos, html5 widgets, text files. You’re now facing the problem of taking one page (which can extend beyond the screen boundaries) and presenting it in the empty space of your UIView dedicated to the page rendering.

In the next paragraphs I will present the following methodologies to achieve this result:

  • pdf document rendering
  • pre-rendered image display
  • free format CoreText rendering
  • web based approach

The magazine is a PDF file

You may like it or not, but should your software house be committed to develop a magazine iPad app, the magazine will be with high probability given to you as a PDF file. As there is no way to escape from it, at the end you will need to develop your own pdf reader or integrate some free or commercial external library. The reason why pdf is still the dominant format in the e-publishing world is clear: most of the publishers are porting their existing printed publications on the iPad, and for obvious budget reasons they want to reuse all the investment done in the creation of their issues. You will not be able to escape from the pdf format dictatorship with the exception of two cases: the publication is brand new and only digital, so there are no previous investments to drive the final choice, or the publisher has large budgets and/or is a strong user experience (UX) believer and accepts to allocate the extra budget to recreate a different format for its publications. Both cases are not so uncommon with those publishers that already did the effort to bring their products to the web (with the notable exception of those that did it in Flash!), but the large part of the small and medium publishers will still be locked to the pdf format.

Unfortunately the pdf is not the best way to port a magazine in the iPad. And this for several reasons:

  • printed magazines page size is usually larger than the iPad screen: this means that when the page fits to the screen, all characters appear smaller and then something readable in the printed paper could become unreadable without zoom; but zoom is not always efficient and in particular it’s not loved by readers that may lose their “orientation” inside the page.
  • printed magazines pages have not the same aspect ratio of the iPad screen: this means that a page that fits in the screen will be bordered by top/bottom or left/right empty stripes.
  • often printed page layouts are optimized for facing pages, e.g. a panorama picture which is spread between two pages; when the device is kept in portrait orientation, these graphical details will be lost, instead if the device is kept in landscape you will be able to appreciate the two-pages layout but characters will be too small to be read comfortably.
  • as these files are not optimized for digital, normally the outlines (table of contents) and annotations (links to pages or external resources) are not exported; this means that even if your pdf reader code is aware of this information, in the majority of cases it is not available and then you will need to define a different way to provide it.
  • the official pdf format supports multimedia content; unfortunately the iOS is not able to manage it, so all interactive content must be provided outside the pdf file.

The page rendering is achieved in iOS (and OSX too) through the Quartz 2D API, provided within the Core Graphics framework (shorted with CG). Quartz 2D is the two-dimensional drawing engine on which are based many (but not all) of the drawing capabilities of iOS. The PDF API is a subset of the huge CG API. This API is “old fashioned” and is not based on Objective-C but on pure old C; besides all memory management rules will follow the Core Foundation (CF) rules which are different from Obj-C one: this means that special attention must be provided to avoid memory leaks - or bridging if you use ARC -, as each PDF page manipulation can take several megabytes and leaks will easily trigger the memory watchdog, thus force quitting your app. In any case if you are used with Quartz 2D concepts it will be immediate to render a PDF page, by following these basic steps:

  1. get the CG reference to the pdf page to be drawn;
  2. get the current graphics context for the view that will contain the page;
  3. instruct Quartz to draw the pdf page to the context.

As you can see, apart the required steps needed by the drawing model of Quartz, the full rendering is accomplished by the system and you don’t need to have any knowledge of the data format of a pdf file. So for you the pdf rendering processor is just a black box, and this is clear when you see that all CG data structures are in fact opaque and their inner contents can be accessed only via API.
But a valid pdf magazine reader cannot limit itself to rendering, so you will be required to support zoom. Now as your maximum zoom level can be theoretically very high (don’t forget that characters in the pdf file are like fonts in the computer, they will never lose in precision even for extreme zoom-ins), it is impossible to render the full zoomed page in a canvas much higher than the device screen: here we have pixels, not vectors, and it would be immediate to crash the app because all the memory has gone away for one page only. So you will be forced to introduce tiling techniques that will limit the effective rendering to the visible part of the page, not always an easy task.

More difficult is document parsing: this is required if you want to extract outlines, annotations, do some text search and highlight. In such case apart a few meta data extraction functions, what the API gives you is a set of functions that will allow you to explore the data structures inside the document. You will not be able to get any information from the file if you don’t explore the data tree correctly and if you don’t follow the specs of the PDF document. This is worsened by the many versions the PDF specs got in the years and by the fact that many publishers still use old software that exports the content in the old formats. I have developed a general purpose PDF explorer, this was part of a commitment of a client that asked me to develop a general purpose PDF reader; but as it is really hard to apply all the specs of the PDF official reference, my suggestion is to concentrate on the most used features and test them with many documents. As I said before, CG navigates the data tree but it doesn’t interpret it for us!

The last section of this part, long explanation but required given the importance of the topic, is how to provide multimedia content on top of a PDF file: all in all the iPad is a so versatile device that we cannot limit ourselves to simple page rendering. By adding extra content to the printed page you can leverage the device characteristics and still taking benefit on the investment done in the magazine creation.
There are many reasons to justify this choice: e.g. a printed advertisement can offer a video instead of a static picture, or a printed link to a web page can be replaced by an active link to a web view, or finally we can show the current weather using an html5 widget. As I previously said it is not recommended to introduce all this content inside the pdf file: it will not be rendered by Quartz and you will still be forced to traverse the data tree to extract the CG object reference for further manipulation. Finally not all publishers are aware of these functionalities or their digital publishing software is too old to fully support them.
So the best solution is based on the “overlay technique”. This methodology consists in representing the pages in two layers:

  • the bottom layer (“rendering layer”) will contain the PDF rendering, so it will contain the bitmap image of the page;
  • the top layer (“overlay layer”) will draw all overlays and is sensible to user touches.
The overlay layer is typically made of UIKit components, so we’ll add a UIWebView for html widgets, we’ll introduce a UIScrollView to display a gallery of sliding images, or we’ll add a Media Player view for video execution. Typically the overlay descriptions are provided on a separate file, e.g. an xml, json or plist, and they will be packed together with the pdf file and all assets (movies, images, html files, music files) in a zip file.
The app will download the zip file, will unpack it and then for each page it will use the pdf page to fill the rendering layer, and the overlay information associated to that page to build the overlay layer.
Note that this technique can be applied also in the other rendering techniques we’ll talk about in the next paragraphs, in such case it allows to overcome many of the pdf format limitations. The major requirement for the developer is to define a suitable format, follow all page zoom and rotations with a corresponding overlay transformation and finally provide the publisher with the instruments and guidelines required to easily create such overlays.

Core Text rendering

Core Text (short: CT) is another of those technologies developed for the Mac and later ported to iOS. The Core Text framework is dedicated to text layout and font handling. Just to summarize the capabilities of this framework, consider that is at the base of the desktop publishing revolution that made the Mac famous in this professional sector.
As CG, even CT has a C-based API, even if there are several third-party open source wrappers that pack together the most common functionalities in a high-level Objective-C interface. CT should not be used to replace web based rendering based on html and css, this is a too complex field that is better to leave to dedicated system components such as then UIWebView instead it can be used to efficiently render some rich text. CT talks with CG, in fact text rendering is done at the same time of view Quartz based rendering. The two APIs have similar conventions and memory management rules, so the developer already accustomed with Core Foundation programming model will not find an hurdles in understanding the CT API. This gives the possibility to the developer to eventually mix the text rendering and image drawing at the same rendering stage (CT is limited to text only, it has no image drawing capabilities).

The main reason to use Core Text is because it does direct rendering of text on page without any intermediaries. It differs from PDF which consider each page as a whole, it differs from web based techniques as there is no intermediate language (html) or layout interpretation (css) in between, you can write directly on the page. The basic components behind CT are layout objects such as “runs”, which are direct translation of characters into drawable glyphs, “lines” of characters and “frames”, which correspond to paragraphs. The translation of characters to glyphs is done by “typesetters” and the text to be plotted is provided using attributed strings, which are common strings enriched with attribute informations (font size, color, ornaments).
You will decide to use Core Text for a magazine whose layout will be mostly based on text with standard layout, so it fits well for newspapers also. Probably it’s not the best choice for glamour magazines where graphics layout is changing on every page and could be quite complicated.
A clear advantage of the Core Text based solution is that you don’t need to apply the overlay technique we mentioned in the paragraph dedicated to pdf. With CT you will directly divide your page in frames and each of these frames will contain text (rendered by CT) or multimedia. Essentially you can define the page layout by selecting a size (it can fit the iPad screen or it can be vertical or horizontal scrolling page), then you will decide the size and position of media content in this page and finally you will define the frames (several rectangular frames) that will contain the text. The text frames organization can be of any kind, from compact single column structures, two multicolumn layout or varying size frames. Inside the frames you will render the text and Core Text will help you to manage line breaks for these paragraphs. Then you can easily provide the user the possibility to change font type and size and the same rendering code can be reused to quickly rearrange the text inside the frames. The page layout representation can be provided in any form decided by the developer together with the publisher, the best choice will be XML (all in all it’s the base of any markup format!) and it will be shipped to the app together with the texts (still XMLs) and the assets in a zip file package. One limitation of Core Text is that it is a text drawing technology and is not optimized for editing (but we don’t need it at this stage) and user interaction. This means that if we want to provide text highlight or select and copy features we’ll need to implement them by our own; the framework provides us some APIs to facilitate this task but in any case the code to implement these functionalities must be written by the developer to manage every single detail. In any case all these tasks will be greatly simplified in comparison with PDF: here you have full control of the text and its position of screen, while pdf is still an opaque entity hidden behind a complex data structure that you cannot control in its entirety.
Our recommendation is that if you must implement a digital magazine, without extreme layout requirements, some multimedia content and a fast and powerful control of text, using Core Text is the first technology choice to be considered. An excellent tutorial on the subject is available at this link on Ray Wenderlich blog

PAGES PRE-RENDERED BY IMAGES

This technique is heavily used inside the highly interactive magazines published using the Adobe Digital Solutions environment: well known examples are the Condé Nast magazines (Wired is one of the most famous examples). The way these magazines are implemented starts with the well known suite of Adobe Digital Publishing tools, In Design in primis. These tools are used by many publishers around the world and the latest versions offer the possibility to export the project, other than in the ubiquitous pdf format, in a package suited for distribution through iPad. The output of these files can be tested using the free app Adobe Content Viewer downloadable from the App Store, but of course the final branded app, together with the server infrastructure required to serve the contents, requires a higher tier license.

What characterizes this kind of magazines is that at the moment of project creation all pages are pre-rendered as jpeg or png images and then special effects are overlaid. This means that the core section of the magazine reader is essentially an image viewer. Sure these images will span an area slightly larger than the iPad screen, so they will be embedded inside a scroll view, but they are still images. All in all technically the choice is not bad: the iPad is quite better in rendering images than PDF files, as the required calculations needed to transform the pdf data in bitmaps is completely skipped here, while the CPU will just need to decompress the image and send it to the graphics hardware. Exactly as we did in the PDF case, we can apply the overlay technique to over impose some content that requires user interaction on top of the bottom rendering layer.

While this technique is highly efficient from the point of view of rendering time, and is simple to implement as all the page layout complexities have been taken into account and solved by the desktop publishing tools, it offers a few limitations that need to be considered:

  • every single page takes quite more space on disk and download time of this kind of magazines is increased correspondingly; in comparison with a pdf page, the space taken is much more as every pixel of text must be provided in the file and we cannot force high compression ratios if we don’t want to introduce blurring in the text. The pdf page, especially those pages made of text only, is much lighter as the text is not prerendered. With the New iPad Retina Display, things got much worse for this methodology.
  • zooming or font resizing is not feasible: both pdf and core text redraw the text using vectorial algorithms or per-size font representations, this is not possible to achieve on a static image. This means that the magazine needs to be drawn with specific fonts types and sizes, fonts which are well suited for jpeg compression (no blur) and the screen resolution (132 dpi, not so high; things are better with the new retina display iPad!)
  • text search, highlight and selection is impossible, unless the digital publishing tool exports together with the pre-rendered pages a full map of text coordinates, something I haven’t seen yet!

Adobe is not alone in publishing this kind of magazines: there are several custom apps in the market that follow exactly the same approach. It’s not bad but is not leveraging the great publishing frameworks that Apple is offering to its developers. And it has too many limitations if compared with other techniques. For sure a publisher that is mastering the digital publishing tools I mentioned before can take advantage of this approach, as the final quality is undoubtable and the time to market is the shorter, and at the same time allows to provide a content suited for the iPad, and not just a pdf fit on screen.
But I would recommend to all developers that are making custom products and are not using specialized page composition tools to stay away from such methodology.

HTML(5) APPROACH

The final technique is something that is emerging now, especially thanks to the great improvements in term of stability and speed introduced by the latest version of iOS for the in-app web views. A couple of good examples of this approach are the Ars Technica app and the Bloomberg Businessweek+ magazine.

The concept is quite simple: html and css are common and powerful techniques to layout a page on screen: why not leverage the skills developed by many web designers to make a magazine that perfectly fits with the iPad?
The core block at the base of this approach is the UIWebView Cocoa Touch object: with this view we can load any kind of html document, loaded locally or remotely, and layout it in the page at an adequate speed (but not the fastest) and without surprises. Besides we can get rid of the overlay technique, as the web view is capable of displaying images, playing movies and of course execute javascript based widgets. Also this component provides a two way interaction between the javascript world and the objective-c runtime (and in fact this justifies the existence of extension languages such as Objective-J, provided with the Cappuccino framework). Finally the web view is highly respondent to user interactions, and some features like text selection and dictionary lookup come for free.

The open-source world is highly active in this area: projects like Baker, Siteless, Laker and pugpig make publicly available this kind of solution. Note that they are more dedicated to e-books than magazines, but Baker is moving to the Newsstand and multiple-issues integration.

Sincerely we don’t know if this will be the final solution for everybody. Of course a publisher that already invested in setting up a web site (but not in Flash!), and this is quite common between newspapers, will be able to port most of the layout and contents to the iPad, and sometimes this can be achieved with an adaption of the CMS output views to provide files that can be easily fed to the app. Careful must be given to don’t push this behavior at its extremes: don’t forget in fact that web page rendering requires an inner engine and at the end any intermediate layer will require resources and extra time. Sometimes, and this is particularly evident with the first generation of the iPad, content updates following user interaction are not very reactive. So it is not recommended to transform every single aspect of the magazine app into web based content: clearly in this way you’re helping all javascript developers not skilled with objective-c, but a performance penalty will be visible. As an example, the toolbar typical of all magazine apps used to access extra features (sharing, table of contents, home page, etc.) should always be done using the native Cocoa Touch component and not an html+css solution.
However if the publisher accepts to convert his design flow to a web based one and you, as developer, prefer to base your work on consolidated and easy to manipulate methodologies, this one should be your first choice to be taken in consideration.

CONCLUSIONS

We hope this article gives a good overview of the major techniques used to render pages in a magazine, newspaper or e-book. It could be we have not mentioned some technique we’re not aware of, in such case dear reader any feedback from you is welcome!

Comments