5.2 Raster Image Processing
Wayne Collins
The raster image processor (RIP) is the core technology that does the computational work to convert the broad range of data we use to create a computer graphic into the one-bit data that drives a physical imaging device. Let’s examine the creation of a single character of the alphabet, or glyph. A font file delivers PostScript language to the RIP that describes a series of points and vector curves between those points to outline the letter A. The RIP has a matrix grid at the resolution of the output device and computes which spots on the grid get turned on and which are turned off to create the shape of that letter A on the output device. The spots on the grid can only be turned on or off — which is how binary data is encoded — either as 0 or 1. The grid then acts as a switch to turn a mechanical part of the imaging engine on or off.
With computer-to-plate technology for lithographic printing plate production, a laser is used to expose an emulsion on a printing plate. Most plate-setters have a resolution of 2,000 to 3,000 lspi (laser spots per inch). The RIP calculates all the spots that must be turned ‘on’ to create the graphic that will be imaged on the printing plate. If the image fills a typical sheet-fed press, it is (30 inches x 3,000 lspi) x (40 inches x 3,000 lspi) = 10.8 billion spots, which at one bit per spot takes about 1.35 gigabytes of computer memory to store and transfer for each uncompressed plate. A printing plate for flexographic print production is created by turning a laser on and off at a slightly lower resolution. An inkjet printer uses the same RIP process to deliver the same one-bit data to each inkjet nozzle for each colour of ink in the printer. Most inkjet engines have a resolution between 600 and 1,200 spots per inch, so the matrix grid is smaller, but if it is an eight-colour printer, the data for all eight nozzles must be synchronized and delivered simultaneously. An electrophotographic (Xerox) printer usually has a resolution similar to an inkjet printer and utilizes a similar RIP process to change a grid of electrostatic charges to positive or negative on an electrostatic drum that is the maximum media size the machine can image. Each colour in the printer has a separate raster image that charges the drum in the right spot to attract that colour of toner to that exact location. The data for each colour must be synchronized for simultaneous delivery. The data must refresh the charge on the drum after each print in order to pick up new toner. That is a very important fact to remember when we talk about personalizing print with variable data later in this chapter.
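The plate arithmetic is easy to check with a few lines of code. The following is a minimal sketch, not part of any RIP; the function name raster_size is an assumption made for illustration, and it counts uncompressed one-bit data for a single colour separation only.

```python
def raster_size(width_in, height_in, spots_per_inch):
    """Return (spot_count, byte_count) for an uncompressed one-bit raster."""
    spots_wide = width_in * spots_per_inch
    spots_high = height_in * spots_per_inch
    spot_count = spots_wide * spots_high
    # One bit per spot, eight bits per byte, rounded up to a whole byte.
    byte_count = (spot_count + 7) // 8
    return spot_count, byte_count


# A 30 x 40 inch plate imaged at 3,000 laser spots per inch.
spots, size_bytes = raster_size(30, 40, 3000)
print(f"{spots:,} spots, {size_bytes / 1e9:.2f} GB")
# prints: 10,800,000,000 spots, 1.35 GB
```

Running the same function with a lower resolution, for example raster_size(30, 40, 1200), shows how quickly the grid shrinks for inkjet and electrophotographic engines, although those devices must hold one such grid for every colour they print.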
This basic understanding of RIP’s place in a computer graphic workflow is essential to understanding how to prepare files for, and manage, RIP resources. It is also essential in solving some of the common problems we see in various RIPs. When we compare the two mass production imaging technologies, lithography and flexography, to the personalized imaging technologies, electrophotography and inkjet, we can identify some core similarities. In lithography and flexography, a high-powered laser is used to alter a physical emulsion that is durable and finely grained enough to let the laser image a spot that is one three-thousandth of an inch without affecting the spot of equal size beside it. We can reliably image that spot in a serif of a glyph set in one point type or a hair on a face in a photo that is imaged with a 5 micron frequency modulated (FM) screening pattern. The mass production technology assures us that the first print will be identical to the millionth print.
The raster grid of one-bit data that the RIP produces must be delivered to the imaging drum or the inkjet nozzle for every image that is produced with an inkjet printer or an electrophotographic engine. This is what allows us to make every image different and personalize it for the person we are delivering the image to. It also makes the process slower and less reliable for mass production. The RIP produces a lower resolution raster grid, so the detail in photos and letter shapes is not as precise. We can have a RIP discard data if we have too much detail for the raster grid it is producing. The RIP does not do a good job of interpolating more data to produce additional detail in a photo or graphic shape if that information is missing to begin with.
That brings us to examining the resources that a RIP must have to produce a perfect raster for every graphic shape it renders, and for every colour being reproduced. The resource a RIP consumes is data. In the graphic communications industry, we should all wear T-shirts that say ‘Pigs for data!’ just to distinguish us from our media colleagues who are producing computer graphics for electronic media. If we think of a RIP as an auto assembly line we are feeding with parts, in the form of files in different data formats, it will help us understand how to make a RIP more efficient. If we feed too many parts into the assembly line, it is easier to throw some parts away than it is to stop and recreate a part that is missing. If we feed the assembly line with five times as many parts needed to make a car, it is still more efficient to throw parts away than it is to stop and recreate a missing part.
If we apply this analogy to image resolution, we can point to examples where designers regularly repurpose images from a web page to use on a book cover or poster print. The web page needs to deliver the photo across a network quickly and only needs to fill a typical computer screen with enough detail to represent the photo. A typical photo resolution to do that properly is 72 pixels per inch. Now remember that the raster grid for a lithographic printing press that will print the book cover is 3,000 lspi. Our RIP needs much more data than the web page image contains! Most of the photos we are reproducing today are captured with electronic devices — digital cameras, phones, scanners, or hand-held devices. Most store the data with some kind of compression to reduce the data the device has to store and transfer. Those efficiencies stop at the RIP though, as this computational engine has to decompress the data before applying it to the graphic page it is rasterizing. It is like breaking a steering wheel down to wires, bolts, and plastic sleeves that efficiently fit into a one-inch-square shipping package, and putting this ‘IKEA furniture’ steering wheel onto an auto production line for the assembler to deal with in two-point-two minutes!
On the other hand, we can capture a digital photo at 6,000 pixels per inch (ppi) and use it on a page scaled to half the original dimension. That is like packing a finished steering wheel in 10 yards of bubble wrap and setting it on the assembly line in a wooden shipping crate! So it is important for designers to pay attention to the resolution of the final imaging device to determine the resolution that the RIP will produce from the graphic files it is processing.
Halftone Screening
It is important to stop here for a discussion about the halftone screening that a RIP applies to photographs and graphics to represent grey levels or tonal values in a graphic element. We described how the RIP makes a grid of one-bit data, but graphics are not just black and white; they have tonal values from 0% (nothing printing) to 100% (solid printing). If we want to render the tonal values in between in half-percent increments, we need 200 addresses to record the different values. Computer data is recorded in bits, which have two values (on and off), and bytes, which are eight bits strung together as one unit. The number of values a byte can record is 256, the number of combinations of on and off that the eight bits in the byte can express. A computer records a byte of data for each primary colour (red, green, and blue, or RGB) for each detail in a photo, as a pixel (picture element), which controls the phosphors on electronic imaging devices. A RIP must convert the eight-bit RGB values into the four primary printing ink colours (cyan, magenta, yellow, and black, or CMYK). There are two distinct steps here: (1) conversion from RGB to CMYK continuous tone data (24-bit RGB to 32-bit CMYK); and (2) continuous tone to one-bit screening algorithms. We have to be in the output colour space before we can apply the one-bit conversion. The RIP converts the eight-bit tonal values into one-bit data by dividing the area into cells that can render different sizes and shapes of dots by turning spots on and off in the cell. A cell with a grid that is 10 laser spots wide by 10 laser spots deep can render 100 different dot sizes (10 x 10), from 1% to 100%, by turning on more and more of the laser spots to print. If we think back to the plate-setter for lithographic platemaking, we know it is capable of firing the laser 2,000 to 3,000 times per inch.
If the cells making up our printing dots are 10 spots square, we can make dot sizes that have a resolution of 200 to 300 halftone screened dots in one inch. A RIP has screening (dot cell creation) algorithms that convert the data delivered in RGB pixels at 300 pixels per inch into clusters of laser spots (dots) for each printing primary colour (CMYK).
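The idea of building a halftone dot from laser spots can be sketched in code. This is a simplified illustration under stated assumptions, not a production screening algorithm: the function names are invented for this example, the dot is grown outward from the centre of the cell (a clustered dot), and screen angles and dot shapes are ignored.

```python
def clustered_dot_ranks(cell=10):
    """Rank every spot in a cell x cell grid by its distance from the centre."""
    centre = (cell - 1) / 2
    coords = [(r, c) for r in range(cell) for c in range(cell)]
    # Spots nearest the centre switch on first, so the dot grows outward.
    coords.sort(key=lambda rc: (rc[0] - centre) ** 2 + (rc[1] - centre) ** 2)
    ranks = [[0] * cell for _ in range(cell)]
    for rank, (r, c) in enumerate(coords):
        ranks[r][c] = rank
    return ranks


def halftone_cell(tone_percent, cell=10):
    """Return a one-bit cell with tone_percent of its spots turned on."""
    spots_on = round(tone_percent * cell * cell / 100)
    ranks = clustered_dot_ranks(cell)
    return [[1 if ranks[r][c] < spots_on else 0 for c in range(cell)]
            for r in range(cell)]


def screen_ruling(spots_per_inch, cell=10):
    """Halftone dots per inch for a device resolution and cell width."""
    return spots_per_inch / cell


# Show a 30% dot as text: '#' is a spot turned on, '.' is a spot left off.
for row in halftone_cell(30):
    print("".join("#" if bit else "." for bit in row))

print(screen_ruling(3000))  # 300.0
```

With a 10-spot cell, a 30% tone turns on 30 of the 100 spots, and a 3,000 lspi plate-setter yields 300 halftone dots per inch. A larger cell would give more tonal steps but a coarser screen on the same device.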
This description of how a RIP processes photographic data from a digital camera can help us understand why it is important to capture and deliver enough resolution to the RIP. It must develop a detailed representation of the photo in a halftone screened dot that utilizes all of the laser spots available. The basic rule is: Required PPI = 2 x lines per inch (LPI) at final size. So if you need to print something at 175 lines per inch, it must have a resolution of 350 pixels per inch at the final scaled size of the reproduction. Use this rule if you are not given explicit direction by your print service provider. You can use a default of 400 ppi for FM screening where lpi is not relevant.
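The rule can be turned into a quick preflight check. The sketch below is only an illustration; the function names and the idea of an 'effective' resolution (the native resolution divided by the scaling applied on the page) are labels assumed for this example.

```python
def required_ppi(lpi, quality_factor=2):
    """Required PPI = 2 x lines per inch at final size."""
    return quality_factor * lpi


def effective_ppi(native_ppi, scale_percent):
    """Resolution of an image after it is placed at scale_percent of its size."""
    return native_ppi * 100 / scale_percent


def has_enough_resolution(native_ppi, scale_percent, lpi):
    """True if the placed image meets the 2 x LPI rule."""
    return effective_ppi(native_ppi, scale_percent) >= required_ppi(lpi)


print(required_ppi(175))                    # 350
print(has_enough_resolution(72, 100, 175))  # False: a web image at full size
print(has_enough_resolution(300, 50, 175))  # True: 300 ppi at half size is 600 ppi
```

Enlarging an image lowers its effective resolution, so a 350 ppi photo placed at 200% delivers only 175 ppi to the RIP and fails the same test.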
WYSIWYG
It is important to know that each time we view a computer graphic on our computer screen, it is imaging the screen through a RIP process. The RIP can change from one software program to another. This is why some PDF files look different when you open them in the Preview program supplied with an Apple operating system than they do when opened in Adobe Acrobat. The graphics are being processed through two different RIPs. The same thing can happen when the image is processed through two different printers. The challenge is to consistently predict what the printed image will look like by viewing it on the computer screen. We use the acronym WYSIWYG (what you see is what you get) to refer to imagery that will reproduce consistently on any output device. Designers have faced three significant challenges in trying to achieve WYSIWYG since the advent of desktop publishing in the early 1980s.
The first challenge was imaging typography with PostScript fonts. The second was colour managing computer screens and output devices with ICC profiles. The third and current challenge is in imaging transparent effects predictably from one output device to another. Font problems are still the most common cause of error in processing client documents for all imaging technologies. Let’s look at that problem in depth before addressing the other two challenges in achieving WYSIWYG.
Font Management
The development of the PostScript computer language was pioneered by Adobe in creating the first device-independent font files. This invention let consumers typeset their own documents on personal computers and image their documents on laser printers at various resolutions. To achieve WYSIWYG on personal computer screens, the font files needed two parts: screen fonts and printer fonts. Screen fonts were bitmaps that imaged the letter shapes (glyphs) on the computer screen. Printer fonts were vector descriptions, written in PostScript code, that had to be processed by a RIP at the resolution of the printer. The glyphs looked significantly different when imaged on a 100 dpi laser printer than they did on a 600 dpi printer, and both were quite different from what graphic artists/typographers saw on their computer screen. That was not surprising since the shapes were imaged by completely different computer files (one raster, one vector) through different RIP processors, on very different devices. Many graphic designers still do not realize that when they use Adobe Type 1 font architecture they must provide both the raster screen font and the vector PostScript font to another computer if they want the document that utilizes that font to process through the RIP properly. This was such a common problem with the first users of Adobe fonts that it was the first problem solved in the TrueType font architecture, which Apple developed and Microsoft licensed to compete with Adobe fonts. TrueType fonts still contained bitmap data to draw the glyphs on a computer screen, and vector outline data to deliver to a RIP on a print engine. The TrueType font file is a single file, though, that contains both raster and vector data. TrueType fonts became widely distributed with all Microsoft software. Microsoft also shared the specifications for TrueType font architecture so users could create and distribute their own fonts.
The problems with keeping screen font files together with printer font files went away when graphics creators used TrueType fonts.
The quality of the fonts took a nose dive as more people developed and distributed their own font files, with no knowledge of what makes a good font, and what can create havoc in a RIP. Today, there are thousands of free TrueType fonts available for downloading from a multitude of websites. So how does a designer tell a good font from a bad font? The easiest way is to set some complicated glyphs in a program like Adobe InDesign or Illustrator and use a ‘convert to outlines’ function in the program. This will show the nodes and Bézier curves that create the glyph. If there are many nodes with small, straight line segments between them, the font may cause problems in a RIP. Remember that PostScript was meant to be a scalable, device-independent programming language. If the poorly made glyphs are scaled too small, the RIP has to calculate too many points from the node positions and ends up eliminating many points that are finer than the resolution of the raster image. On the other hand, if the glyph is scaled too large, the straight lines between points make the smooth curve shapes square and chopped-looking. These fonts are usually created by hand drawing the letter shapes, scanning the drawings, and auto tracing them in a program like Illustrator. The ‘convert to outlines’ test reveals the auto tracing right away, and it is a good idea to search out another font for a similar typeface from a more reputable font foundry.
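The visual ‘convert to outlines’ test can also be expressed as a rough numeric check. The sketch below is purely illustrative: the outline format (a list of segments, each tagged "line" or "curve" with start and end points in font units), the function name, and the 5-unit threshold are all assumptions made for this example, not values taken from any font tool.

```python
import math


def short_line_ratio(segments, min_length=5.0):
    """Fraction of outline segments that are straight lines shorter than min_length."""
    if not segments:
        return 0.0
    short = 0
    for kind, (x0, y0), (x1, y1) in segments:
        # Only straight segments count; curves are what a well-drawn glyph uses.
        if kind == "line" and math.hypot(x1 - x0, y1 - y0) < min_length:
            short += 1
    return short / len(segments)


# Hypothetical auto-traced outline: a 'curve' built from 20 tiny straight steps.
traced = [("line", (i, 0), (i + 1, 0)) for i in range(20)]

# Hypothetical clean outline: one curve and one long straight stem.
clean = [("curve", (0, 0), (100, 100)), ("line", (100, 100), (100, 0))]

print(short_line_ratio(traced))  # 1.0
print(short_line_ratio(clean))   # 0.0
```

A ratio close to 1.0 suggests the auto-traced construction described above, while a ratio near zero is what a carefully drawn outline would produce. Where to draw the line between the two is a judgment call.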
Another good test is to look at the kerning values that are programmed into the font file. Kerning pairs are glyph shapes that need the space between them tightened up (decreased) when they appear together. A good font usually has 600 to 800 kerning pair values programmed into its file. The most common pair that needs kerning is an upper case ‘T’ paired with a lower case ‘o’ (To). The ‘o’ glyph must be tucked under the crossbar of the T. This is done by programming a negative letter space into the font file, so that the imaging engine advances less (has less escapement) between rendering the first shape and starting the second. If we set the letter pair, and put the cursor in the space between them, a negative kerning value should appear in the kerning tool. If no kerning value appears, the font is usually a poor one and will cause spacing problems in the document it is used in.
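A small sketch shows how a kerning pair changes the space a word occupies. The advance widths and the kerning value below are hypothetical numbers in font units (assuming 1,000 units to the em), chosen only to illustrate the mechanism; they are not taken from any real font.

```python
# Hypothetical advance widths (escapement) in font units.
ADVANCE = {"T": 611, "o": 556}

# Hypothetical kerning table: a negative value tightens the pair.
KERNING = {("T", "o"): -80}


def text_width(text, advance=ADVANCE, kerning=KERNING):
    """Total width of a run of glyphs, applying any kerning pairs found."""
    width = 0
    for i, glyph in enumerate(text):
        width += advance[glyph]
        if i + 1 < len(text):
            # Look up the pair formed with the next glyph; default to no kerning.
            width += kerning.get((glyph, text[i + 1]), 0)
    return width


print(text_width("To"))              # 1087 with the kerning pair
print(text_width("To", kerning={}))  # 1167 without it
```

The 80-unit difference on one pair looks small, but across a whole document these adjustments add up to different line endings, which is why a font with no kerning pairs causes spacing problems.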
Another common problem occurred when combining Adobe Type 1 fonts with TrueType fonts in the same document. Adobe was the creator of the PostScript programming language, and although it was easy enough to copy its code and create similar fonts, Adobe has maintained fairly tight control over licensing the PostScript interpreting engines that determine how the PostScript code is rendered through a raster image processor. The RIP stores the glyph shapes in a font file in a matrix that can be speedily accessed when rendering the glyphs. Each glyph is assigned an address in the matrix, and each font matrix has a unique number assigned to it so that the RIP can assign a unique rendering matrix. Adobe could keep track of its own font identification numbers but could not control the font IDs that were assigned to TrueType fonts. If a TrueType font had the same font ID number as the Adobe Type 1 font used in a document, the RIP would establish the glyph matrix from the first font it processed and use the same matrix for the other font. So documents were rendered with one font instead of two, and the glyphs, word spacing, line endings, and page breaks were all affected and rendered incorrectly. For the most part, this problem has been sorted out with the creation of a central registry for font ID numbers; however, there are still older TrueType font files out there in the Internet universe that will generate font ID conflicts in a RIP.
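The font ID conflict can be modelled with a small lookup table. This is a toy model, not the way any real RIP is implemented: the class name, the ID number, and the font names are all invented for illustration. It shows only the behaviour described above, where the first font loaded under an ID is the one used for every later request with that ID.

```python
class GlyphCache:
    """Toy model of a RIP glyph store keyed only by font ID."""

    def __init__(self):
        self._by_font_id = {}

    def load(self, font_id, font_name, glyphs):
        # setdefault keeps the first entry, so a later font that reuses
        # the same ID is silently ignored.
        return self._by_font_id.setdefault(font_id, (font_name, glyphs))

    def render(self, font_id, char):
        name, glyphs = self._by_font_id[font_id]
        return f"{name}:{glyphs[char]}"


cache = GlyphCache()
cache.load(1234, "Type 1 Serif", {"A": "serif A"})   # hypothetical Type 1 font
cache.load(1234, "TrueType Sans", {"A": "sans A"})   # same ID, different font

print(cache.render(1234, "A"))  # Type 1 Serif:serif A
```

Both fonts in the document end up rendered from the first matrix. Giving each font a key that is unique to it removes the collision, which is the purpose of a central registry for font ID numbers.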
Adobe, Apple, and Microsoft all continued to compete for control of the desktop publishing market by trying to improve font architectures, and, as a result, many confusing systems evolved and were discarded when they caused more problems in the RIPs than they solved. There is a common font error that still causes problems when designers use Adobe Type 1 fonts or TrueType fonts. Most of these fonts only have eight-bit addressing and so can only contain 256 glyphs. A separate font file is needed to set a bold or italic version of the typeface. Some page layout programs will allow the designer to apply bold or italic attributes to the glyphs, and artificially render the bold or italic shapes in the document on the computer screen. When the document is processed in the RIP, if the font that contains the bold or italic glyphs is not present, the RIP either does not apply the attribute, or substitutes a default font (usually Courier) to alert proofreaders that there is a font error in the document. The line endings and page breaks are affected by the error — and the printing plate, signage, or printout generated becomes garbage at great expense to the industry.
To solve this problem, Adobe actually cooperated with Microsoft and Apple in the development of a new font architecture. OpenType fonts have Unicode addressing, which allows them to contain thousands of glyphs. Entire typeface families can be linked together to let designers seamlessly apply multiple attributes such as condensed bold italic to the typeface, and have the RIP process the document very closely to what typesetters see on their computer screen. PostScript is also the internal language of most page layout software, so the same OpenType font files are used to rasterize the glyphs to screen as the printer’s RIP is using to generate the final output. There can be significant differences in the RIP software, but many font issues are solved by using OpenType fonts for document creation.
One common font error still persists in the graphic communications industry, and it acutely underlines the difference between creating a document on a single user’s computer and processing it through an imaging manufacturer’s workstation. Designers usually own a specific set of fonts that they use for all the documents they create. The manufacturer tries to use the exact font file each designer supplies with the document. The problem once again involves the font ID number, as each font file activated in an operating system is cached in RAM to make the RIP-to-screen process faster. So the font files the manufacturer receives can be different versions of the same font created at different times, but assigned the same font ID number. For example, one designer uses a 1995 version of Adobe’s Helvetica typeface and another uses a 2015 version, but the two typefaces have the same font ID number. The manufacturer’s operating system will not overwrite the first font matrix it cached in RAM, so it is the first font file that renders the document on screen and will be sent down to the RIP. Usually, there are few noticeable changes in the glyph shapes. But it is common for font foundries to adjust kerning values between letter pairs from one version to the next. So if a manufacturer has the wrong version of the font file cached in RAM, a document can have line-ending changes and page reflows. This is a hard error to catch. There are programs and routines the imaging manufacturer can implement to clear the RAM cache, but many times, more ‘garbage’ is generated before the problem is diagnosed. Modern PDF creation usually includes the production of a uniquely tagged font subset package that only contains the glyphs used in the document. The unique font subset ID avoids the potential for font ID conflicts.
Managing fonts on a single user computer has its own challenges, and Apple has included Font Book with its operating systems to help manage fonts in applications on an Apple OS. Adobe offers Typekit with its latest Creative Cloud software to provide greater access to a wide variety of typefaces from a reliable foundry. Third-party font management programs like Suitcase Fusion also help graphic artists manage their fonts for repurposing their documents effectively. It is still the responsibility of individual operators to know how to use the fonts in their documents. They should also make sure that the fonts are licensed and packaged to deliver to other computer systems so that they can drive many different RIPs on a wide variety of output devices.