6.7 Variable Data Printing
Roberto Medeiros
Variable data printing, or VDP, refers to a special form of digital printing where document content is determined by entries in a record or data set and can be highly personalized. Varied text, graphics, and images are typical content elements, but layout, element positioning, and even document choice are just some of the other variables. Because the content on the printed page is constantly changing, it would not be feasible to produce this type of print product with traditional offset lithography or with any other process that requires a fixed image plate. Electrophotographic and ink jet printing are ideally suited for this type of printing as each page is imaged individually.
VDP can take many forms. Transactional documents like invoices and statements are probably the oldest form of VDP, but these have evolved to include marketing or informational content. This is known as trans-promo or trans-promotional. A mail merge is a simple form of VDP where a static document has data elements added directly to it. Each record in the data set produces one document. Another VDP form is when you enter the record manually or upload a simple text-based data table, which then fills the content of a template. This method is typically found in web2print solutions and produces items such as business cards, where the layout, fonts, and required elements can be predetermined and the content based on the data entered. More advanced VDP solutions may include campaign management tools, workflow management, two-dimensional barcode generation, image-based font technology, and integration into external systems such as databases, email, web2print solutions, data cleansing, or postal optimization solutions.
One of the core purposes of VDP is to increase response rate and, ultimately, conversions to the desired outcome. In order to accomplish this, it is critical that the content presented is relevant and has value for the intended audience. Today, there are massive amounts of data available on customers and their behaviour. Analyzing and understanding customer data is essential to maintaining a high degree of relevancy and engagement with the customer.
VDP can be broken down into six key components: data, content, business rules, layout, software, and output method. Each component can vary in complexity and capability and may require advanced software solutions to implement. However, even the most basic tools can produce highly effective communications.
Data
Data used for VDP can be simply thought of as a table or data set. Each row in the table is considered a single record. The columns are the fields used to describe the contents of the record. Some examples of columns or fields would be first name, last name, address, city, and so on. The simplest and most common form of representing this table is by using a delimited plain text format like comma separated value (CSV) or tab delimited. The delimiter separates the columns from one another and a new line represents a new row or record in the table. Here is an example of CSV data:
"FirstName","LastName","Gender","Age","FavQuotes"
"John","Smith","M","47","Do or do not, there is no try."
"Mary","Jones","F","25","Grey is my favourite colour."
The first row contains the column headers, which name the fields, and is not considered a record. You’ll notice that each field is separated by a comma but is also enclosed within quotes. The quotes are text qualifiers and are commonly used to prevent issues when the delimiting character also appears in the contents of a field, as is the case with the comma inside the quotation in the first record above. Many VDP applications also support relational databases queried with SQL, but a query must be performed to extract the data to be used, which ultimately results in the same row and column record structure. The data must be attached or assigned to the document in the page layout or VDP application.
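As a minimal sketch of how such a file is read, the following Python uses the standard csv module to turn the sample data into one record per row. The names SAMPLE and load_records are illustrative assumptions, not part of any VDP product.

```python
import csv
import io

# The sample data from the text, written with plain straight quotes.
SAMPLE = '''"FirstName","LastName","Gender","Age","FavQuotes"
"John","Smith","M","47","Do or do not, there is no try."
"Mary","Jones","F","25","Grey is my favourite colour."
'''

def load_records(text):
    """Return one dict per record, keyed by the column headers."""
    # The comma is the delimiter; the double quote is the text qualifier.
    reader = csv.DictReader(io.StringIO(text), delimiter=",", quotechar='"')
    return list(reader)

records = load_records(SAMPLE)
print(len(records))             # the header row is not counted as a record
print(records[0]["FavQuotes"])  # the comma inside the quotes is preserved
```

Because the text qualifier is respected, the comma inside John’s quotation stays part of the field instead of starting a new column.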
Content
Content refers to elements displayed on each page. This would include text, graphics, and images, both static and dynamic. Dynamic content uses placeholders, typically named by the column headers of the data, to mark the position of the element and reference the data in the specific column of the current record. When the document is rendered, the placeholder is replaced by the record data element.
“Dear <<FirstName>>…” becomes “Dear John…” when the document is rendered for the first record and “Dear Mary…” for the second record, and so on. A complete document is rendered per record in the dataset.
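The placeholder replacement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <<Field>> syntax follows the example in the text, and the render function is a hypothetical helper, not how any particular VDP application is implemented.

```python
import re

# Matches placeholders such as <<FirstName>>.
PLACEHOLDER = re.compile(r"<<(\w+)>>")

def render(template, record):
    """Replace each <<Field>> placeholder with the value from one record."""
    def lookup(match):
        field = match.group(1)
        # Fail loudly if the template names a column the data does not have.
        if field not in record:
            raise KeyError(f"no column named {field!r} in the record")
        return record[field]
    return PLACEHOLDER.sub(lookup, template)

records = [
    {"FirstName": "John", "LastName": "Smith"},
    {"FirstName": "Mary", "LastName": "Jones"},
]

# One complete document is rendered per record.
documents = [render("Dear <<FirstName>>,", record) for record in records]
print(documents)
```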
Business Rules
Business rules are one of the key elements that make VDP documents highly useful. They can be thought of as a series of criteria that are checked against the data to determine what gets displayed on the page. They can also be used to manipulate the data or filter out irrelevant content. In almost every case, some level of scripting is required. Advanced VDP solutions have built-in scripting capability, utilizing either a common scripting language such as VBScript or JavaScript, or a proprietary scripting language that is only applicable in that specific application. If the page layout tool you are using to create your VDP document does not have scripting capability, you can apply business rules to the data beforehand in a spreadsheet application like Microsoft Excel or Google Sheets.
One of the most common methods for implementing a business rule is using a conditional or IF statement comprising a logical test, an action for a ‘true’ result, and an action for a ‘false’ result (see Figure 6.11).
The logical_test is a condition whose answer will be either true or false. In this case, we want to change our graphics based on gender.
In plain English, you may say:
“IF gender is male, THEN use plane_blue.tif, or ELSE use plane_orange.tif”
In scripting, it would look something like this:
IF(Gender="M","plane_blue.tif","plane_orange.tif")
When doing this in a spreadsheet, you would enter the script in a cell in a new column. The result of the script is displayed in the cell, not the script itself. This new column could now be used to specify the content to be displayed in your layout application. In dedicated VDP applications, the script is attached to the object itself and is processed and displayed in real time.
The “@Plane” column was added to dynamically change a graphic based on the contents of the cell in the “Gender” column (C2) (see Figure 6.12).
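The same rule can be applied to the data before it reaches the layout application. The Python below is a hedged sketch: it assumes the “M” and “F” codes from the sample data and the file names from the plain-English rule, and plane_graphic and add_plane_column are hypothetical helper names.

```python
def plane_graphic(gender):
    """IF gender is male THEN blue plane, ELSE orange plane."""
    return "plane_blue.tif" if gender == "M" else "plane_orange.tif"

def add_plane_column(records):
    """Return copies of the records with a computed "@Plane" column added."""
    return [{**record, "@Plane": plane_graphic(record["Gender"])}
            for record in records]

records = [
    {"FirstName": "John", "Gender": "M"},
    {"FirstName": "Mary", "Gender": "F"},
]

for row in add_plane_column(records):
    print(row["FirstName"], row["@Plane"])
```

As in the spreadsheet approach, the layout application sees only the result of the rule in the new column, not the rule itself.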
Business rules can also be applied to the VDP workflow. In this case, the workflow application or component can manipulate the data before applying it to a document, or it can select the document or destination to be used for the record and much, much more.
Layout
When working with variable data documents, there are special layout considerations you should be aware of. Because word lengths will change per record, there needs to be sufficient space to accommodate both the longest and the shortest values, preventing overset text while maintaining the desired visual appearance. This challenge is compounded by the prolific use of proportional fonts. Character widths differ with each letter, so word lengths will vary even when the number of characters is the same. This can also force a paragraph to reflow onto another page and change the number of pages in the document. Additional scripting may be required to handle reflow scenarios. Some applications use special copy-fitting algorithms to dynamically fit text into a defined area. The use of tables for layout purposes can also be helpful.
Because we are dealing with dynamically generated documents, we may also want to vary the images. Images with a consistent size and shape are easier to work with.
Transactional documents, such as statements and invoices, use numbers extensively. Most fonts, including proportional ones, keep numbers mono-spaced. In other words, every number character occupies the same amount of space. This is important because, visually, we want numbers to be right justified and to line up vertically in columns with the decimal points aligned. There are, however, some fonts that do not follow this common practice. These fonts may be suitable for use in a paragraph but not for displaying financial data.
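To illustrate the copy-fitting idea mentioned above, here is a rough Python sketch that shrinks the type size until an estimated line width fits a box. The character widths are invented for illustration; a real application would read the advance widths from the font’s own metrics, and the function names are assumptions.

```python
# Hypothetical advance widths, as a fraction of the font size.
# Real widths come from the font's metrics, not from a table like this.
NARROW = set("iljtfI.,'! ")
WIDE = set("mwMW")

def estimated_width(text, font_size):
    """Estimate the set width of text in the same units as font_size."""
    total = 0.0
    for ch in text:
        if ch in NARROW:
            total += 0.3
        elif ch in WIDE:
            total += 0.9
        else:
            total += 0.55
    return total * font_size

def copyfit(text, box_width, max_size=12.0, min_size=6.0, step=0.5):
    """Reduce the size in steps until the text fits or min_size is reached."""
    size = max_size
    while size > min_size and estimated_width(text, size) > box_width:
        size -= step
    fits = estimated_width(text, size) <= box_width
    return size, fits

# Same number of characters, very different widths in a proportional font.
print(estimated_width("ill", 10) < estimated_width("mmm", 10))
print(copyfit("Mary", 100))
print(copyfit("Wolfeschlegelsteinhausen", 100))
```

When the text still does not fit at the minimum size, the second return value is false, which is the point where a rule for truncation or reflow would have to take over.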
Software
Software that can generate a data-driven document is required for variable data printing. In the early days of VDP, there weren’t many choices for designers. It was common practice to hand code VDP in PostScript, since it was both a programming language and a PDL. Applications like PageMaker and Illustrator were PostScript design applications but lacked VDP capabilities. Applications like PlanetPress emerged as dedicated PostScript VDP applications. Today, designers have a wide variety of software available for creating VDP. There are three basic VDP software types: a built-in function within a page layout or word-processing software, a third-party plug-in, or a dedicated VDP application.
Microsoft Word, for example, has a mail merge function but does not have the ability to vary images, just text. Adobe InDesign has the data merge function, which is basically a mail merge but includes the ability to vary images as well. In both these examples, business rules would be applied to the data prior to using it in these applications.
There are a number of very sophisticated plug-ins available for InDesign. These leverage the extensive page layout capability of InDesign while adding scripting and other VDP-specific capabilities. XMPie and DesignMerge are examples of these types of plug-ins. FusionPro is another plug-in-based VDP product, and while it does have an InDesign plug-in, it only uses this to allocate variable text and image boxes in the layout. Business rules and specific content are applied in its companion plug-in for Adobe Acrobat.
PlanetPress and PrintShop Mail are examples of dedicated applications that combine both page layout and VDP functions. Although they are very strong in VDP functionality, they sometimes lack the sophistication you’d find in InDesign when it comes to page layout. These particular applications have recently moved from PostScript-based VDP to a more modern HTML5 and CSS (cascading style sheets) base, making it easier to produce and distribute data-driven documents for multi-channel communications.
Output Method
The ‘P’ in VDP stands for printing and is the main output method we will discuss here. However, document personalization and data-driven communications have evolved to also include email, fax, web (PURL, personalized landing page), SMS text messaging, and responsive design for various mobile device screen sizes. With the emergence of quick response (QR) codes, even printed communications can tap into rich content and add value to the piece. In order to take advantage of these additional distribution and communications channels, a workflow component is often employed.
A key element for optimized print output of VDP documents is caching. This is where the printer’s RIP caches or stores repeatable elements in print-ready raster format. This means the RIP processes these repeating elements once, and then reuses these preprocessed elements whenever the document calls for them. This does require a RIP with enough power to process large amounts of data and resources with support for the caching scheme defined in the VDP file but, ultimately, allows the printer to print at its full rated speed without having to wait for raster data from the RIP.
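The benefit of caching can be shown with a small Python analogy. This is not how a RIP is implemented, and the caching rules in a real job are defined by the VDP file format; the sketch only assumes a hypothetical rasterize step that is expensive and therefore worth doing once per repeated element.

```python
from functools import lru_cache

raster_calls = []  # records each time an element is actually rasterized

@lru_cache(maxsize=None)
def rasterize(resource_name):
    """Stand-in for the expensive conversion of an element to raster data."""
    raster_calls.append(resource_name)
    return f"raster({resource_name})"

def render_job(records):
    """Build one page per record, reusing cached rasters where possible."""
    pages = []
    for record in records:
        background = rasterize("background.pdf")  # static, same on every page
        plane = rasterize(record["@Plane"])       # repeats across many records
        pages.append((background, plane, record["FirstName"]))
    return pages

records = [
    {"FirstName": "John", "@Plane": "plane_blue.tif"},
    {"FirstName": "Mary", "@Plane": "plane_orange.tif"},
    {"FirstName": "Omar", "@Plane": "plane_blue.tif"},
]

pages = render_job(records)
print(len(pages), "pages,", len(raster_calls), "elements rasterized")
```

Three pages use six graphic elements in total, but only three distinct elements are ever processed; the rest are served from the cache.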
There have been many proprietary VDP file formats that have striven to improve performance of VDP over the years, but the industry is moving rapidly toward more open standards. PODi, a not-for-profit consortium of leading companies in digital printing, is leading the way with two widely adopted open VDP standards. These standards are PPML (Personalized Print Markup Language) and PDF/VT (portable document format/variable transactional).
PPML
PPML, first introduced in 2000, is a device-independent, XML-based printing language. There are two types of PPML: thin and thick. Thin PPML is a single file, with the .ppml extension, containing all the instructions necessary for producing the VDP document. It does include caching instructions; however, all resources such as fonts or images are stored outside the file. The path to these resources is defined in the RIP, and the resources are retrieved while the document is being rendered. Thin PPML is ideal for in-house VDP development where resources may be shared by multiple projects. These files are extremely small; however, network speed and bandwidth may affect performance, and thin PPML is more difficult to implement when using an external print provider. Thick PPML is a .zip file containing all the required resources (fonts, images, instructions, etc.). This format makes the file highly portable and easy to implement on the print device, but it has a larger file size when compared to thin PPML. RIPs that support the PPML format can import the .zip file directly. Regardless of the type used, PPML benefits from exceptional performance, an open standard, open job ticketing support (JDF), and overall reduced file size. To generate PPML, an advanced VDP solution is required.
PDF/VT
PDF/VT is a relatively new international standard (ISO 16612-2) that has a lot of potential. It is built on the PDF/X-4 standard and benefits from its features, such as support for transparency, ICC-based colour management, extensive metadata support, element caching, preflighting, and much more. In short, PDF/VT includes the mechanisms required to handle VDP jobs in the same manner as static PDF printing, allowing print providers to use a common workflow for all job types, including VDP. Many of the latest releases of advanced VDP solutions already support PDF/VT, as do many DFE manufacturers.
For more information on PPML and PDF/VT, please refer to the PODi website at: http://www.standards.podi.org