96 pages, published in 2011
The path from journeyman to master is long. In the case of data visualization, the path has been well marked by many accomplished designers and cognitive scientists who have been doing great work for decades. We gladly follow in their footsteps, and we hope you will, too.
In these pages, however, our goal is not so much to take you to the summit as to start you down the path—and that the path is quite rewarding to travel. Our goal is to give you confidence as you begin your journey.
Many statisticians and practitioners with excellent coding and data munging skills are nevertheless stuck in a rut of common formats and default settings, which lead to mundane, suboptimal visualizations. But the domain of hand-crafted, fine-tuned, note- worthy visualizations is not limited to “creative” types; it is accessible with a bit of guidance.
The truth is, there is plenty of room for artistry and creativity in data visualization. But success is built upon a linear process that encodes information for visual transmission and subsequent decoding by wetware—the reader’s brain. One aim in writing this book is to introduce you to this process, including some basic concepts and best practices, so that your message may be transmitted with minimal interference.
It is a process. And design is something you’re probably already doing, whether you’re designing applications, frameworks, graphics, or something else. Design is simply a process of organized thinking, planning, and executing. You are making design choices, intentional or not. Of course, intentional choices have a better chance of being useful than arbitrary or accidental choices.* This book is a road map to those choices: it is meant to make you aware of the choices you get to make, and to help you make useful, intentional design decisions at every turn.
This high-level road map is one we haven’t seen presented anywhere else. It will give you the general lay of the land. It is a set of steps and rules to follow that will get you 80% of the way to turning out great work. We’ll introduce many questions you’ll need to ask yourself, and point you in the direction of some answers. The nuanced details
* Suh: The Principles of Design (Oxford University Press); Schon: The Reflective Practitioner (Basic Books). vii
of those answers have already been addressed by others, and we hope you will continue down the path with further guidance from our colleagues and mentors (see the Reading List in Appendix A).
Note that Appendix A also covers some of the many tools available for creating data visualizations, and we hope you will peruse them. But you’ll find a discussion of tools intentionally missing from the rest of the book, because the topic at hand is, “What problem are you solving?” (and the questions you’re answering), rather than, “What tools are you using?” Design and implementation are two separate things.
As in any creative discipline, the best data visualizations are forged by breaking some of the rules. But rules must be broken with intention. One must learn the rules (well, more like guidelines) before one is entitled to break them. With that in mind, we present for your consideration our process for the visual encoding of information.
How This Book Is Organized
This book is organized into two major parts, which can loosely be thought of as practical theoretical foundations and applied suggestions, respectively.
In Part I, we discuss different kinds of visualization (including infographics and visual art) and explore the influences at work in each one. The goal is to help you become a more savvy consumer of visualizations, as well as a more organized thinker when cre- ating your own visual work.
In Chapter 1, we introduce some ways of classifying and describing different styles of visualization, so that you can begin to think about and describe what you’re designing.
In Chapter 2, we introduce the three fundamental influences to the visualization prod- uct—the designer, the reader, and the data—and describe how each should shape what is eventually created.
In Part II, we apply these concepts to the design process. The goal is to help you think in a linear way about how to select and apply appropriate encodings for your data.
In Chapter 3, we focus on getting to a clear understanding of your goals—and defining the requisite supporting data—so that you can implement them most effectively.
In Chapter 4, we lay out heuristics for understanding the shape of your data and choos- ing compatible visual properties and structures with which to encode it.
In Chapter 5, we dive deep into the property of spatial position—axes and placement— one of the most important properties you’ll need to select. We also discuss using dif- ferent visualization structures.
In Chapter 6, we look at best practices and offer specific suggestions for encoding many specific different data types with visual properties. We also present warnings against common pitfalls and dark patterns.
viii | Preface
Finally, the Appendices are full of resources and references meant to help you put your skills into practice and expand your knowledge beyond this volume.
Appendix A contains a list of tools to help you get started, as well as a suggested reading list to expand your knowledge and understanding of design concepts.
Appendix B is a list of the questions and decisions you’ll confront as part of the design process. We hope you’ll read the entire book, and then use this section as a refresher whenever you design a new visualization.
What We Mean When We Say...
In this book, we’ll use some specific terms to describe your data and visual encodings. Here is a handy glossary for quick reference.
Chart: Something that shows qualitative information (e.g., flow charts).
Data dimensions: One single channel of data. A stock graph may comprise four properties: date, price, company, and market cap. Each is a unique dimension of the data, which can be encoded separately, with a different visual property.
Data visualization: Visualizations that are algorithmically generated and can be easily regenerated with different data, are usually data-rich, and are often aesthet- ically shallow.
Designer: The creator of a visualization; any reader of this book.
Encoding: The visual property (noun) applied to a dimension of data that enco- des (verb) the information into a visual medium for decoding by the reader’s brain. Explanatory visualization: Data visualizations that are used to transmit infor- mation or a point of view from the designer to the reader. Explanatory visualiza- tions typically have a specific “story” or information that they are intended to transmit.
Exploratory visualization: Data visualizations that are used by the designer for self-informative purposes to discover patterns, trends, or sub-problems in a data- set. Exploratory visualizations typically don’t have an already-known story. Graph: Something that shows quantitative information (e.g., pie graphs and bar graphs).
Infographic: Visualizations that are manually generated around specific data, tend to be data-shallow, and are often aesthetically rich.
Reader: The consumer of a visualization, often someone other than the designer. The reader has information needs that are meant to be satisfied by the visualization. Visual property: A characteristic that you can see. Color, size, location, thickness, and line weight are all visual properties.
Variability of a property or data dimension: Within a visual property or single data dimension, what values are present or allowed, and how they change. Integers vary discretely; position can vary continuously. Categories are finite (and discrete, though maybe hierarchical); numbers are infinite
Noah Iliinsky, Julie Steele