STA553 E-Pack: Data Visualization
West Chester University
Topic 1 Introduction
The term data visualization has been in use for a long time, yet the field is still evolving because of continuing advances in computing technology.
Data visualization is the presentation of data in a pictorial or graphical format, and a data visualization tool is the software that generates this presentation. Data visualization provides users with intuitive means to interactively explore and analyze data, enabling them to effectively identify interesting patterns, infer correlations and causalities, and support sense-making activities.
1.1 Data or Information Visualization?
Sometimes, data visualization is also called information visualization. Can these two terms be used interchangeably? To answer this question, we need to know what data is and what information is.
Data and information have each been defined in several ways. The following definitions are used here.
Data are facts, figures, observations, or recordings that can take the form of images, sound, text, or physical measurements. Data can come from many sources and can be split into two groups based on the form they take: structured data and unstructured data.
Structured data - typically categorized as quantitative data - is highly organized and easily decipherable by machine learning algorithms.
Unstructured data - typically categorized as qualitative data - cannot be processed and analyzed via conventional data tools and methods.
Information is a collection of data that has been processed, organized, or structured in a meaningful way to convey knowledge, ideas, or instructions. It can be communicated through various mediums, such as text, images, audio, or video, and can be accessed and shared through multiple channels, such as books, websites, and social media.
Relationship between Data and Information: On their own, raw data carry no meaning or significance. Information is data that has been processed so that it has meaning and significance. Information therefore depends on data.
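The relationship between data and information can be illustrated with a minimal sketch. It is written in Python for illustration only, not in the R or Tableau used in this course. The temperature readings, the city names, and the function name summarize are all hypothetical and are not taken from the course material. The raw records play the role of data; the per-city averages obtained by processing them play the role of information.

```python
from statistics import mean

# Hypothetical raw data: (city, daily high temperature in degrees Celsius).
# On their own these records are just recorded observations.
raw_data = [
    ("Philadelphia", 31.0),
    ("Philadelphia", 29.5),
    ("Boston", 26.0),
    ("Boston", 27.5),
    ("Boston", 25.0),
]


def summarize(records):
    """Group raw records by city and return the mean reading per city."""
    grouped = {}
    for city, temp in records:
        grouped.setdefault(city, []).append(temp)
    return {city: round(mean(temps), 2) for city, temps in grouped.items()}


# Processing and organizing the data yields something that conveys meaning.
information = summarize(raw_data)
for city, avg in information.items():
    print(f"{city}: average high {avg} degrees C")
```

A chart drawn from the averages rather than from the unprocessed records is, in the sense of the definitions above, a visualization of information.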
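Based on the above definitions of data and information, it is more appropriate to call data visualization information visualization, because what a visualization presents is data that has already been processed and organized to convey meaning.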
Another concept related to what we will do in this course is scientific visualization, which is defined as follows.
Scientific visualization: the graphical representation of data as a means of gaining understanding and insight into scientific data. It can also refer to visual data analysis.
1.2 Aesthetic Considerations
Data visualization is both an art and a science. An aesthetically designed visualization makes the visual representation of data and information more effective. With advances in graphical software, aesthetic features have been increasingly used in various visual designs, including data and information visualization. The challenge is to create aesthetically attractive visualizations without misleading the viewer or distorting the information.
The basic elements of data aesthetics are shape, size, color, position, orientation, font type, font size, and many others.
The key to creating an aesthetically effective and persuasive data visualization is choosing the right tools for the right data/information. Keep visualization creative but simple! The following are a few examples that illustrate the right visual tools for the right types of data.
- Visualizing the magnitude of data
- Visualizing the proportion of data
- Visualizing the distribution of data
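The three cases above differ in the quantity that gets drawn. The following minimal sketch computes that quantity for each case and prints a rough text rendering. It is written in Python for illustration only, not in the R or Tableau used in this course. The sales and score values, the function names, and the pairing of each case with a chart type (bar chart, pie or stacked bar, histogram) are assumptions made for this example.

```python
from collections import Counter

# Hypothetical data: units sold per product category.
sales = {"A": 40, "B": 25, "C": 35}

# Hypothetical data: exam scores for a small class.
scores = [52, 67, 71, 74, 78, 81, 85, 88, 93, 96]


def magnitudes(values):
    """Magnitude: categories sorted by absolute size (as in a bar chart)."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


def proportions(values):
    """Proportion: each category's share of the total (as in a pie chart)."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def distribution(data, bin_width):
    """Distribution: counts per equal-width bin (as in a histogram)."""
    counts = Counter((x // bin_width) * bin_width for x in data)
    return dict(sorted(counts.items()))


def text_bars(pairs, scale=5):
    """Render (label, number) pairs as rows of '#' characters."""
    return [f"{label:>4} | {'#' * round(number / scale)}" for label, number in pairs]


print("Magnitude (units sold):")
print("\n".join(text_bars(magnitudes(sales))))

print("Proportion (share of total):")
for category, share in proportions(sales).items():
    print(f"{category:>4} | {share:.0%}")

print("Distribution (scores per 10-point bin):")
print("\n".join(text_bars(distribution(scores, 10).items(), scale=1)))
```

The same sales figures feed both the magnitude and the proportion views. What changes is whether the reader is meant to compare absolute sizes or shares of a whole, which is why the choice of visual tool depends on the question being asked.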
1.3 Topic Coverage
This course focuses on both static and interactive data visualization, using a programmatic approach with R and a non-programmatic approach with Tableau. We will also briefly discuss the basic principles of visual design and their application in data visualization. The topics range from the technical tools and platforms commonly used in visualization, through basic statistical plots and interactive graphics, to dynamic dashboards.