Augmenting Visualization Tools with Automated Design & Recommendation


>>Okay, it gives me great pleasure to welcome Kanit, and I'm not going to attempt your last name, Ham. He is a graduating PhD student at the University of Washington in Jeff Heer's group, and he's built some fantastic tools. I don't want to steal too much thunder, but I'm really happy to have you here today and hope you all can tune in and also come in person to ask questions. Thanks.
>>So, everyone hear me okay? It's cool.
>>What's your last name?
>>I can say my last name. Hey, my name is Kanit “Ham” Wongsuphasawat. Today I'm going to talk about my research on augmenting visualization tools with automated design and recommendation. So let me begin by introducing a little bit about myself. My research interests are at the intersection between
user interface systems, information visualization
and data science. My research mission is to help people work with and benefit from data. And with this mission in mind, my PhD research focused on the design of intelligent visualization systems. Today it is probably indisputable that visualization is a critical tool for data science. We have seen analysts use plots as one of the first tools to look at the data, starting from looking at distributions to exploring potential relationships in the data. And even when we create a machine learning model, it's important to check that the input data for the model doesn't have any data quality issues like biases and so on. Besides looking at input data and analyzing model performance, developers also use visualizations like diagrams to understand and communicate complex structures like deep learning model architectures. Since visualization is critical for data science, my research explores how we can provide automation to help people create visualizations more effectively and with less effort. And in this talk, I
will show how I design visualization tools
by augmenting them with automated design
and recommendation. For the main part of this talk, I will show how I designed formal languages for chart specification and recommendation, and used them to build recommendation-powered interfaces for data exploration. At the end of the talk, I will also show how I designed tools that ship in TensorFlow to help developers visualize the structure of deep learning models. Although these tools started as research projects, all the tools that you see in this talk are open sourced and adopted by the data science and research communities. And before I dive into the details of these tools, I'd like to highlight that the common challenge for all these tools is to find the right balance between
automation and user control. On one hand we want to
provide automation to guide users with best practices and reduce tedium
in their workflow. On the other hand, we want to preserve user control to let them steer the automation and have the flexibility to leverage their domain knowledge and intuition to achieve their goals. So let's first see how I balance between automation and user control in exploratory data analysis. Consider when analysts receive a new dataset that they haven't seen before. In an ideal exploration, a good analyst should perform two high-level tasks. First, they should begin their analysis with a broad exploration to familiarize themselves with different fields in the dataset. After getting a broad overview, they may focus on answering specific questions. For example, they may examine whether a pair of variables is correlated. Investigating these questions may spark exploration of other potentially relevant fields, leading again to more focused analysis. So this is an ideal workflow that we hope analysts would follow when they explore data. However, there are a few reasons that prevent analysts from achieving this goal in practice. First, many novice analysts often have what we call tunnel vision: instead of having the discipline to explore different aspects of the data, they may overlook data quality issues, like unexpected data values from input entry errors, or prematurely fixate on a specific question or hypothesis. Even well-trained analysts may have limited time for some analysis projects, and since creating plots can be tedious, a well-trained analyst
may suffer from these common pitfalls as well. So now, let's see what I mean by tedious by considering visualization tools for data exploration. Most of these tools require a certain degree of manual chart specification, via a programming language like ggplot2 in R, or via a graphical interface like Tableau or Power BI. These tools are very powerful for asking questions because they can create a variety of charts and answer a variety of questions. However, to provide a complete specification for a single chart, an analyst has to make a number of decisions. From a dataset like a cars collection, the analyst must first select data fields, or columns in the data table, like horsepower and number of cylinders. To summarize the data, she may then apply data transformations, such as aggregating the mean of horsepower and grouping by the number of cylinders. And finally, to produce a chart, she has to decide the visual encodings by selecting the mark type, like bar, and mapping data fields to encoding channels like x and y. And all these tedious steps are just for creating one chart in an analysis process. As a result, this tedium of manual specification, combined with analysts' lack of either discipline or time, can impede exploration and cause analysts to overlook important insights in the data. To address this problem, my research focuses
on designing tools to facilitate rapid and systematic data exploration with
chart recommendation. To achieve this goal, we have to answer a number of questions. First, how should we design interfaces that surface chart recommendations? At the same time, how do we allow users to have control over these recommendations? Under the hood of these interfaces, how do we build a chart recommender engine that allows users to still have control over the recommendations? And to do all of this, we need a representation of the charts themselves. In this first part of the talk, I will show you the systems that I designed to answer each of these questions, starting from the Vega-Lite concise grammar, which serves as the chart representation. Based on Vega-Lite, I then built a visualization query language and recommendation engine called CompassQL. Finally, I used CompassQL to build a series of interfaces called Voyager, which enable new forms of data exploration with chart recommendation. So let's begin from the foundation of this stack of tools, which is the chart representation. So, to support
a broad range of graphics, many popular
visualization tools adopt ideas from the Grammar of Graphics by Leland Wilkinson. The idea is that a grammar of graphics can provide primitive building blocks for composing a broad range of visualizations, just like English grammar informs us how to compose words into sentences. Inspired by this idea, we designed the Vega-Lite grammar for representing charts in our recommendation system and for supporting rapid creation of interactive charts. To support these goals, Vega-Lite offers chart building blocks in a concise language, drawing inspiration from languages like ggplot and the VizQL language underlying Tableau. To provide concision, Vega-Lite provides sensible defaults for low-level details and allows users to override these defaults to customize their plots. However, while VizQL is proprietary and ggplot is tightly coupled to the R environment, Vega-Lite instead offers a universal JSON format and an open source JavaScript library. And by building on web-based technology, Vega-Lite can easily support the development of new higher-level systems, like chart recommendation, and also support usage across multiple programming languages. Beyond existing grammars like ggplot and VizQL, we also offer building blocks for composing multi-view and interactive charts. So let's see how Vega-Lite
provides building blocks for creating a chart, using a histogram as an example. A histogram is essentially bar marks, with the x position encoding a binned field and the y position encoding the count of values within each bin. Vega-Lite provides a JSON syntax to define and compose this structure. First, we can describe the data, in this case from a URL, and set the graphical mark type to bar. We can then define encodings, or mappings between data fields and visual properties like x and y. Here, temperature is mapped to x. We can also define data transformations like binning within the encoding, and finally map the aggregated count to the y axis. Now, we get a representation that reveals the underlying structure of a histogram. Later in this talk, I will show that this kind of representation also enables us to reason about and recommend charts too.
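The histogram structure just described can be written down directly. Below is a minimal sketch of such a Vega-Lite specification, assembled as a Python dict; the data URL and the temperature field name are illustrative placeholders, not the exact spec shown on the slide.

```python
import json

# A minimal Vega-Lite histogram specification, assembled as a Python dict.
# The data URL and field name are illustrative placeholders.
histogram = {
    "data": {"url": "data/seattle-weather.csv"},  # where the data comes from
    "mark": "bar",                                # the graphical mark type
    "encoding": {
        # x: the temperature field, discretized into bins
        "x": {"field": "temperature", "type": "quantitative", "bin": True},
        # y: the count of records within each bin
        "y": {"aggregate": "count", "type": "quantitative"},
    },
}

print(json.dumps(histogram, indent=2))
```

Serialized with json.dumps, this dict is a valid JSON document of the kind the Vega-Lite JavaScript library consumes.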
Note that this histogram specification is concise because, under the hood, Vega-Lite automatically generates sensible defaults for low-level details, like adding a linear scale for the quantitative field and adding axes automatically for both x and y encodings. With these grammar-based
building blocks, we can also add more encodings to the chart. For instance, we can encode the weather type with color, and we get a stacked histogram. Here, Vega-Lite automatically uses a categorical color palette to encode weather type, since it's a nominal field, instead of using a color ramp, which is better for quantitative fields.
>>So, it's not specifying grouping. You're grouping by the weather: sun, snow, rain, fog.
>>So that's [inaudible] from the specification. Part of this is supplying the encoding; another part is that the specification also implies the data query. So when you have three fields, it's kind of like, in SQL, you do a select with a count. That means you have an aggregation. And then whatever is not aggregated in this SQL query, those are the things that you group by.
>>So it's a [inaudible].
>>So we can infer the SQL query from the specification. Even though we provide a default color palette, users can still customize the colors by overriding the default scale properties. For example, we can make sunny weather yellow.
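As a sketch of those two steps, here is the same histogram dict extended with a color encoding and an explicit scale override; the weather categories and hex colors are assumptions for illustration.

```python
# Extend the histogram sketch with a color encoding for the weather type.
# A nominal field gets a categorical palette by default; the "scale" block
# overrides that default so that, for example, sunny weather maps to yellow.
# Category names and colors here are illustrative assumptions.
stacked = {
    "data": {"url": "data/seattle-weather.csv"},
    "mark": "bar",
    "encoding": {
        "x": {"field": "temperature", "type": "quantitative", "bin": True},
        "y": {"aggregate": "count", "type": "quantitative"},
        "color": {
            "field": "weather",
            "type": "nominal",
            "scale": {
                "domain": ["sun", "fog", "drizzle", "rain", "snow"],
                "range": ["#e7ba52", "#c7c7c7", "#aec7e8", "#1f77b4", "#9467bd"],
            },
        },
    },
}

# The override defines a category-to-color mapping, e.g. "sun" -> "#e7ba52".
color_map = dict(zip(
    stacked["encoding"]["color"]["scale"]["domain"],
    stacked["encoding"]["color"]["scale"]["range"],
))
```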
With this kind of concise syntax, we can create and recommend a variety of charts. And we also extend Vega-Lite beyond existing grammars with an algebra for composing multi-view graphics, using operators like Repeat for creating a scatterplot matrix, Concatenation and Layering, and also Faceting for creating small multiples. And we also provide building blocks for specifying interactions on these composed views.
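As one example of the composition operators, a Repeat-based scatterplot matrix can be sketched as follows; the field names are illustrative, not taken from the demo dataset.

```python
# Sketch of Vega-Lite's "repeat" operator: a scatterplot matrix is a base
# scatterplot repeated over rows and columns of field names.
fields = ["horsepower", "acceleration", "miles_per_gallon"]  # illustrative

splom = {
    "repeat": {"row": fields, "column": fields},
    "spec": {
        "mark": "point",
        "encoding": {
            # {"repeat": ...} references the field of the current row/column
            "x": {"field": {"repeat": "column"}, "type": "quantitative"},
            "y": {"field": {"repeat": "row"}, "type": "quantitative"},
        },
    },
}

# The operator expands into one view per (row, column) combination.
n_views = len(splom["repeat"]["row"]) * len(splom["repeat"]["column"])
```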
And within Vega-Lite, all these building blocks are available in a single, unifying language. In this talk though, I'm going to focus more on the applications of Vega-Lite, especially on chart recommendation. So I won't cover the details of our syntax for view composition and interactions, but if you want to learn more about them, feel free to look at our public talk videos available on YouTube, on our website, and in our research papers. I believe that a version of this talk that we gave here is also recorded on Microsoft's online platform too. So, in terms of applications, let's first talk about open source usage of Vega-Lite. As you can see from the example, you can write Vega-Lite code
to quickly create charts. And to aid our users, we also provide an online editor that helps validate and auto-complete the syntax, so users don't have to remember everything in the syntax. In terms of usage patterns, we have seen people use Vega-Lite in publications. For example, a recent book by Danyel Fisher here and Miriah Meyer comes with an online gallery of Vega-Lite examples. Leading academic journals like Nature have also mentioned that tools like Vega-Lite make scientific data more accessible and reproducible. With a JSON syntax, Vega-Lite can also serve as a visualization file format. We are excited that JupyterLab, which is the next version of the Jupyter data science notebook, already ships with native support for Vega-Lite. And although our main library is in JavaScript, our collaborators have developed native APIs that wrap Vega-Lite in other languages. For example, Altair is a popular Python wrapper for Vega-Lite, and the feedback from the Python community is very encouraging. For example, a review by Dan Saber said that, “It's this type of one-to-one-to-one mapping between thinking, code, and visualization that's my favorite thing about Altair.” And we see Vega-Lite being used in many leading companies. For this month alone, it was downloaded over 80,000 times on NPM, and that's growing. We also received over 1,000 stars on GitHub, and in fact, the Altair wrapper actually received more [inaudible] than Vega-Lite, in part because the Python [inaudible] community is larger. We are very excited about this adoption and
usage in industry. And besides supporting manual chart creation like you have seen, Vega-Lite can support chart recommendation too. So let's see how we do that with the CompassQL visualization query language and engine. I mentioned earlier that manual chart creation can be tedious, and that's in part due to the fact that you have to provide a complete specification of mark type, encodings, and transformations. To reduce the tedium, one idea is to instead let users provide only a part of the specification. And this is the key idea in CompassQL. Basically, we use partial Vega-Lite specifications as a way to describe a recommendation query. This way, we can reduce the need to provide complete specifications while still giving [inaudible] and control for users over their recommendations. So to see how CompassQL works, let's look at a basic query that tries to do automatic mark selection, similar to a feature in Tableau. So let's first begin with a complete Vega-Lite specification, which describes a single chart. Note that here I
show Vega-Lite in a table format to save space. Now, if we want a partial specification that asks the engine to determine the mark, we can use a wildcard to omit the mark. And given the wildcard, the CompassQL engine then enumerates candidate charts by replacing the wildcard with all possible marks.
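The enumeration step can be sketched in a few lines; this is a simplified illustration of the idea, not the actual CompassQL API, and the mark list and field names are assumptions.

```python
# Simplified sketch of wildcard enumeration: a partial specification leaves
# the mark as a wildcard ("?"), and the engine substitutes every mark type.
MARK_TYPES = ["point", "bar", "line", "area", "tick"]  # illustrative subset

query = {
    "mark": "?",  # wildcard: let the engine decide
    "encodings": [
        {"channel": "x", "field": "cylinders", "type": "nominal"},
        {"channel": "y", "field": "horsepower", "type": "quantitative",
         "aggregate": "mean"},
    ],
}

def enumerate_candidates(query):
    """Replace the mark wildcard with each concrete mark type."""
    return [dict(query, mark=mark) for mark in MARK_TYPES]

candidates = enumerate_candidates(query)
```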
However, simply enumerating all mark types may produce charts that violate basic design principles. For example, Origin in this plot is a categorical field. If you use a line or area mark to encode categories that don't have an order, you might suggest a trend that doesn't really exist in the data.
So, that's misleading. Thus, the CompassQL engine also includes built-in design constraints to prune misleading encodings. After pruning, we still have multiple qualifying charts. The second part of the query is the recommendation method, which organizes the results of the recommendation by grouping and ranking the qualifying charts, in this case to choose the single best chart. We can tell the engine to choose the best chart based on a perceptual effectiveness ranking, which we derived from prior work on graphical perception. And since in this specification the only wildcard is the mark, the ranking then considers the effectiveness of the mark type based on the types of the fields on the x and y visual channels. And since bar is a better choice than point for encoding a quantitative and a categorical field, CompassQL will choose the bar chart as the top recommendation in this case. With this query, we can replicate Tableau's Automatic Mark feature, which is kind of cool. However, while
Automatic Mark can save users time for one step, one limitation is that users still have to select the data fields that they want to visualize. And moreover, the goal of CompassQL is enabling new kinds of queries, not just replicating existing features. So let's look at a more advanced query that enumerates both data and visual encodings. So, suppose we want to see pairwise relationships between all quantitative fields of a given dataset. We can make a query with two encoding mappings and make every property, the mark, the visual channels, and the fields, a wildcard. But then, we can
constrain the fields to be just quantitative fields. And seeing this many wildcards, you might say, wow, wouldn't this create so many charts? Well, the answer is yes, of course. But that's why we need to provide a recommendation method, and in this case we want to provide a method to group redundant charts. Since we want to see plots showing different pairs of fields, we group the charts by the data fields that they show. And we see some more structured grouping here, but there are still a lot of charts in each group. The next step is to choose the most perceptually effective chart in each group, and then we get a scatterplot for each pair of quantitative fields. To choose a scatterplot, the ranking considers both the effectiveness of the mark type and the encoding mappings. For the mark, point is best for encoding pairs of quantitative fields, so we use point. For the encoding mappings, we use prior knowledge from graphical perception: people can decode quantitative values from position encodings better than from length, and length is in turn better than angle and area, and so on. With this kind of knowledge, we can rank the effectiveness of the encodings. In this case, we have two quantitative fields, so CompassQL will use the x and y positions, as they are the most effective encodings. And as a result, we get this gallery of scatterplots showing all pairs of quantitative fields.
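The ranking step above can be sketched as a small scoring function; the numeric ordering is a simplification of the perception-based effectiveness rankings mentioned in the talk, and all names are illustrative.

```python
# Simplified effectiveness ranking: position channels are decoded more
# accurately than size or color, so they get a lower (better) cost.
CHANNEL_COST = {"x": 0, "y": 1, "size": 2, "color": 3}

def cost(encodings):
    """Total effectiveness cost of an encoding assignment (lower is better)."""
    return sum(CHANNEL_COST[e["channel"]] for e in encodings)

# Two candidate assignments for a pair of quantitative fields:
positions = [{"channel": "x", "field": "horsepower"},
             {"channel": "y", "field": "mpg"}]
x_and_size = [{"channel": "x", "field": "horsepower"},
              {"channel": "size", "field": "mpg"}]

# x/y positions beat x/size, so the scatterplot encoding wins.
best = min([positions, x_and_size], key=cost)
```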
As you will see in the next section, this is actually a query that a user can specify via a graphical interface. So with this kind of
CompassQL queries, I used an iterative design process to develop a series of systems called Voyager. In this talk, I will first only briefly show the original Voyager and the user study that motivated us to develop Voyager 2. I will then show the Voyager 2 system in detail, including how we use CompassQL to enable new interaction methods in the system. As I mentioned earlier, one motivation for building these kinds of tools is that manual chart creation can impede exploration. So let's see how we designed new interaction methods that apply chart recommendation to facilitate more rapid and systematic exploration. To consider another
interaction model as an alternative to
manual chart creation, we drew some inspiration from exploratory search systems. For example, I like to watch a movie on Netflix on weekends, but I often don't know which movie to watch. The Netflix interface provides a few mechanisms to help me find a movie. First, I can browse around to see the collection of movies that are available and recommended by the system. Second, if I want to see a category of movies, like comedy, I can steer the recommendations by using faceted navigation to filter movies by category. Finally, I can also pick a movie that I liked before to get related suggestions. Inspired by this kind of browsing interface, we developed the Voyager visualization browser, with the goal of having users systematically explore more data and avoid premature fixation. Voyager applies user control by letting users select data fields to steer the recommendations. Given a user selection, Voyager then suggests plots showing transformations of the data, from raw data to aggregated data. And it also shows plots suggesting one extra field, to help users discover other potentially relevant factors. As we want to consider
recommendation browsing as an alternative to manual specification, we compared Voyager with a manual specification tool in a user study on exploratory data analysis. For the baseline condition, we developed PoleStar, which is an interface modeled on Tableau. With this interface, users can create charts by dragging fields to the encoding shelves to specify visual encodings. And the reason that we had to develop this tool is to make sure that the only difference between conditions is the interaction method, but that all plots that can be shown in Voyager can be similarly created in PoleStar. For the study results, I first analyzed interaction logs to see if Voyager met our goal to help users systematically explore more data. I found that users interacted with 1.4 times more unique field sets in Voyager, confirming that Voyager
helped users explore more data. Another aspect of this analysis is that exploratory analysis involves both open exploration and question answering, so we also asked users to rate the tools' suitability for both tasks. For open exploration, users preferred Voyager, as browsing is less tedious and helped them discover insights that they might otherwise overlook. However, they preferred PoleStar for question answering, because they have more control to create the specific charts that they want. Qualitative feedback from our participants also reflects the complementary nature of Voyager and PoleStar. One user said that they would start with Voyager, but would then switch to PoleStar to dive into their questions; and once those questions were answered, they would switch back to Voyager. Overall, the study results show that these two interaction models have complementary values for supporting open exploration and question answering. As exploratory analysis involves both tasks, the users also called for a unified tool that provides a better balance between automatic recommendation and manual specification. And of course, this motivated us to develop Voyager 2, which is a tool that blends specification and recommendation in a single tool. To facilitate broad exploration
and question answering, Voyager 2 provides multiple interaction methods with varying degrees of automation. We build directly on PoleStar, so users can create arbitrary views with the drag-and-drop interface. Moreover, we add two new partial specification interfaces. Based on the main specified view, Voyager 2 shows related views to help the user discover relevant data fields and alternative ways to summarize or encode the data. And to give users control over the automation, Voyager 2 lets users directly specify wildcards to create multiple charts in parallel. So let's see how these three
interaction methods work in a demo. In this demo, we'll use Voyager 2 to explore a dataset about cars. I will also highlight how Voyager 2 produces the underlying CompassQL queries as well. Upon loading the dataset, Voyager 2 lists all the data fields on the left. The middle pane shows encoding shelves where the user can drop fields to specify visual encodings, similar to Tableau. The main pane on the right shows the “Specified View” on the top. As the user hasn't specified any visual encoding, this view is initially empty. Below the “Specified View” is the “Related Views” section. Voyager 2 initially shows univariate summaries to help analysts begin by examining all fields, without the need to create any of these views manually. Looking at the top-left plot, you can see that most cars in this dataset have an even number of cylinders. We can look at a line chart and see that this dataset is kind of old, actually older than me; it is from the 70s. After exploring
each field in the dataset, the analyst may want to explore bivariate relationships between different pairs of fields. Of course, the analyst can manually drag and drop different pairs of fields onto the encoding shelves. But repeatedly doing this can be tedious and requires the analyst to have the discipline to examine all interesting pairs of fields. Instead, Voyager 2 provides wildcards for creating multiple charts in parallel. Below the list of fields on the left are the wildcard fields, which can be used to encode multiple data fields. To let the system pick appropriate encoding mappings, the user can drop any data fields or wildcard fields onto the wildcard shelves. For example, dropping two quantitative field wildcards onto the wildcard shelves produces a gallery of scatterplots between all pairs of quantitative fields. Under the hood, the wildcards in the UI map to the wildcards in the CompassQL queries that I have
shown earlier in this talk. With this gallery, the user can easily explore relationships between all pairs of quantitative fields. For example, we can see that Horsepower and Miles_per_Gallon appear to have a quadratic relationship. Basically, cars with higher Horsepower tend to have lower Miles_per_Gallon. To see what the outlier points are, we can hover to see the tooltip. And using the bookmark button, we can also take some notes about insights during exploration as well. To further drive recommendations based on this view, we can use the “Focus” button to promote this view to be the main view. Clicking the “Focus” button updates both the encoding shelves and the specified view on the top. Based on the specified view, Voyager 2 recommends different
kinds of related views. The related summaries section suggests alternative ways to summarize the same data. For example, from the scatterplot above, we can see a 2D histogram of the same two fields. Under the hood, Voyager 2 generates related views by inferring a query from the specification of the main view. To generate related summaries, Voyager 2 adds wildcards to transformation functions and to marks, to show different ways to summarize the data. Voyager 2 also uses this kind of query inference technique to generate other kinds of related views as well.
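The query inference step can be sketched as follows; this hypothetical helper is only an illustration of the idea of adding wildcards to a main-view specification, not Voyager 2's actual implementation.

```python
# Sketch: infer a related-summaries query from a main view by keeping the
# fields but replacing the summary function on quantitative encodings (and
# the mark) with wildcards, so the engine can propose alternative summaries.
def infer_related_summaries_query(main_view):
    query = {"mark": "?", "encodings": []}
    for enc in main_view["encodings"]:
        enc = dict(enc)  # copy; leave the main view untouched
        if enc.get("type") == "quantitative":
            enc["fn"] = "?"  # wildcard over aggregate / bin / none
        query["encodings"].append(enc)
    return query

main_view = {
    "mark": "point",
    "encodings": [
        {"channel": "x", "field": "horsepower", "type": "quantitative"},
        {"channel": "y", "field": "mpg", "type": "quantitative"},
    ],
}

related_query = infer_related_summaries_query(main_view)
```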
Scrolling down more, we can see the field suggestion sections, which add one additional field to the [inaudible] view to promote the discovery of relationships that we might otherwise overlook. From the demo, we have seen how users can use and transition between all these different interaction methods to perform data exploration. So let us see how we
evaluate Voyager 2.
>>Can I ask a question first? It seems like one of the common things to do is transformations. Like, maybe you want a transformation on something, or maybe power-to-weight ratio is a thing you would want, a column that you would want. And it is easily derived from two of the fields. And I do not see that in there. Is it there and you just did not talk about it?
>>So, we didn't implement that, but we can add it as a part of the CompassQL framework.
>>Is that a thing that you could do, this same kind of suggestion?
>>Yes.
>>Yes, so basically, if you analyze statistics of the data and know that it has some power relationship, then you can apply a log transform. So you can definitely use this kind of technique and surface it in the interface, like Voyager 2. So let's see how we
evaluate Voyager 2. Again, we conducted a user study on exploratory analysis and compared Voyager 2 with PoleStar. The study design allowed us to directly contrast the effect of the new specification interfaces with pure manual specification like PoleStar, which is basically the standard kind of tool that we use these days. So, similar to the prior study, we analyzed whether Voyager 2 promotes data coverage, and found that users interacted with two times more unique field sets in Voyager 2. We also analyzed user ratings for each task. For open exploration, users preferred Voyager 2 over PoleStar, like in the previous study. However, for question answering, users rated Voyager 2 as comparable to PoleStar. And since users favored PoleStar over the original Voyager, here you can see that Voyager 2 improves over Voyager in this aspect. And overall, in terms of supporting both tasks, Voyager 2 improves over both PoleStar and the original Voyager. And despite having more features than PoleStar, many participants also expressed that Voyager 2 is actually easier to use. One said, “I like that Voyager 2 shows me what fields to include in order to make a specific graph. With PoleStar, I had to do a lot of trial and error and could not express what I wanted to see.” Another said, “I feel more confident using Voyager 2. It helped me to learn.” To summarize, Voyager 2 presents a set
of interfaces that bring chart specification and chart recommendation
together in a single tool. And overall, the study shows that Voyager 2 can facilitate both exploration and question answering better than the existing tools that we compared against.
>>Did you happen to compare Voyager 2 to Voyager 1?
>>We didn't do that. But we designed the study in a similar way so we can compare with the previous results. And one of the reasons is that Voyager 1 is good in one aspect, but for question answering, it did not perform as well as PoleStar, which is an interaction model that is widely used in industry. So if we do as well as PoleStar for manual specification, and also improve over it for open exploration, then you can see that we improve on both aspects.
>>I get that, right. The counter is you said Voyager 2 is better than Voyager. It is surprising not to put the two side-by-side and have people go, “Oh wow! This one is so much better because you gave me extra buttons.” Or conversely, “Oh no, these extra buttons are really heavy, giving me a panic attack now that I have had a chance to see this. I preferred it with fewer buttons.”
>>So when we designed Voyager 2, we actually felt like it was becoming a more complex tool than Voyager. But, surprisingly, like you actually said, it is easier to use. So it turns out not as bad as we worried at the beginning.
>>One of the challenges
that I have doing exploratory data analysis is that I often see things in the graph that are not real, that aren't, let's say, statistically significant if you're comparing one against another. Is there a field of study in the [inaudible] to not just support open-ended exploration but also to filter based on what's a real difference versus what you can show at all?
>>Yes. That is something that we definitely have to consider when we work on these kinds of interfaces. For example, in the demo we only showed summaries by mean, but the mean is not always trustworthy, right? Sometimes you want to see the variance. So one of the things that we can look into is trying to provide recommendations for visualizations that also show uncertainty, like error bars. But there are also challenges there: if we add those more sophisticated encodings, how do we explain them to users, especially novices, to make them understand the encodings that we add? So that's definitely a challenge that we need to address going forward.
>>It seems like you
are allowing people to explore a lot of pairs of variables. And that is nice, except, as we were chatting about, it is like you are doing multiple hypothesis testing in this very bottom-up way, and allowing people to, as Andy was saying, come to things that look like conclusions but maybe aren't, or just look like conclusions because of noise. And while I appreciate the ability to explore the space, I am fearful of it. And going back to your second slide of how people approach these problems, you didn't talk about people approaching data analysis problems by having a hypothesis going in that they want to verify, and having a top-down process to complement this bottom-up process. Can you talk about this kind of stuff?
>>Yes, definitely. So,
there is some potential harm if people use this kind of tool in the wrong way. If we talk about exploratory analysis, pioneers like Tukey, when defining it, said that you should distinguish between exploration, where you may look at just some subset of the data, and confirmation. Even if you have a hypothesis, are you sure that your data was collected in the right way? Does it have the right distribution? These kinds of tools are actually aimed more at making sure that you don't just jump into hypothesis testing without checking whether your data is actually representative, or has any bias, or even has bugs or errors. For example, for a variable like age, we have seen datasets use 99 as null, right? So we want to make sure that those kinds of things do not interfere with the analysis. But if you want to do confirmatory analysis, then ideally you should split the data between the two. There is definitely a harm when analysts do not follow this practice, and yes, we can play a part in addressing that too. But that is not something specific to this kind of tool, because tools like ggplot or Tableau already play a part in that problem. So maybe one path is educating people to do the right thing. That is what I would say.
>>You are giving
novices sharp knives. They’re going to cut
themselves or others.>>Yes, that’s very true
for other tools we've seen. For example, I talked
to my boss, Tom, who created D3, which
is a very popular visualization tool. And he's so proud
that he enabled so many
sophisticated visualizations that we see on
the web these days. At the same time, he says that one
deep guilt that he has is that he enabled so many
people to create chartjunk on the Internet. So, obviously, there
is a trade-off. Yes.>>It's unclear that
the right answer to, "Do you make a lot
of sharp knives?", is that other people make a lot
of sharp knives too, though.>>If we can provide
some way to guide people to do things right in the
tool, that's kind of nice. In a way, think about
tools like Tableau: oftentimes, people
use it to jump into a question without
looking at other things. What we tried to do
here is provide a guide that you should
do other things first, but then there's another level
of guidance we have to provide to make sure that, "Hey, don't use this as a way to confirm
your hypothesis.">>Right. When I
was trying to ask about it last week: in the old days, this was the
hypothesis formation, and then there was a later stage where you
look at the rest of the data, or you design a study after the fact to validate
that hypothesis. But, again, people don't seem to be following that
so much these days.>>We jump to conclusions?>>Yes, or see signal in noise.>>Yes.>>Taking a step back to
see a bigger picture here, I have contributed foundations for chart
recommendation, including the Vega-Lite grammar
for representing charts and the CompassQL recommendation engine
and query language. And based on these foundations, I have built the Voyager
and Voyager 2 interfaces to enable a new form of visual data exploration
and chart recommendation. Besides supporting
data exploration, these systems have also served as a foundation for new applications
and research projects. Earlier, we have seen Vega-Lite being used in data science with JupyterLab and a wrapper
like Altair in Python. In ongoing work, we are collaborating with
the Jupyter team to integrate Voyager 2 as an extension
for JupyterLab, so we can help users ease
their transition between data exploration and
other analysis activities in a single environment. Besides these
applications, each of these tools also enables
new research projects. For example, we have built
a model for automated reasoning about
visualization similarity and sequencing on
top of Vega-Lite. Our colleagues at Georgia Tech and Stanford are working on different natural
language interfaces for data visualization and analysis based on Vega-Lite as well. CompassQL is also used to generate training
data for a system that reverse-engineers visual encodings from
bitmap chart images. Voyager has also
been extended to study other issues
in data analysis. For example, my colleague
at UW, Sue Kim, came here to
MSR over the summer and actually extended
Voyager 2 with many people here to study free-form
annotation in data exploration. So far, in the first
part of the talk, I have shown the platform and
interfaces that allow us to balance between automation and user control for
data exploration, and even support
some new applications. In the second part of the talk, I will now show how I combine automatic
layout algorithms with user input to visualize the dataflow graphs of deep
learning models in TensorFlow. A few years
ago, Google released the TensorFlow open-source
library to make it easier to implement and
deploy deep learning models. TensorFlow is basically a set of high-level APIs that user can use to generate
low-level dataflow graph, and this kind of
low-level dataflow graph has many benefits in model
implementation and deployment. However, when developer
works with the code, trying to understand
the network architecture from the code can
be challenging. In practice, the first
often use diagrams to help them understand and share high-level structure
of their models. Whenever there’s a new paper
of a newer model, we will see a diagram explaining the
modular architecture. At work, developers also
draw diagrams as a way to understand existing code base or help explain their models
to other developers. Since diagrams are
critical to their work, developers design it to draw
this diagram automatically. Of course, before
building any tool, the first thing to try is to use standard graph layout like Graphviz to visualize the graph. However, TensorFlow
graphs contain thousands of
low-level operations, so the output layout
looks pretty complex. You cannot even see
the ellipses for the nodes here. This is actually
a hello-world example, but it looks way
more complicated than the network diagrams
that we saw earlier. To design a tool
that matches developers' needs, we worked closely with deep learning researchers
and developers at Google to understand the
differences between the TensorFlow graph and the diagrams that
they normally draw. First, the TensorFlow graph contains low-level operations without
any explicit groupings. Meanwhile, the
expert diagrams show high-level structures
between layers or groups of operations. Moreover, the TensorFlow graph is cluttered because there are some high-degree nodes
that are not even important, like logging: when you log data, you connect to
all layers in the graph. That's what caused
this spaghetti hairball. In contrast,
the expert diagrams don't even include these bookkeeping
operations at all. Based on these insights, we created the TensorFlow
Graph Visualizer to convert this low-level dataflow graph into
diagrams that show high-level structures, like the diagrams that experts draw. And we use two strategies to
produce this high-level diagram. First, we apply techniques from the visualization community,
like building a hierarchical clustered graph, so we can show
a high-level overview and let you expand the graph to
see its internal structure. The tricky part of
this technique, though, is how to provide
a node clustering that matches users' mental maps. Since it's actually
impossible to infer that from the graph
topology alone, here we balance between automation
and user control by letting users specify name hierarchies
in their source code. The second strategy is to extract unimportant nodes
to the side of the graph, so we designed a set of heuristics to
extract high-degree, unimportant nodes,
but we still give you the control to
customize the extraction. So, let's see how the tool
works by trying to explore and understand
the differences between layers in a
convolutional network for image classification. Here we see the interface. The Graph Visualizer is released as
a part of TensorBoard, which is the official
visualization tool that ships with TensorFlow, under the "Graph" tab. And the visualization shows
the graph in two parts: the main graph on the left shows the model's core flow, while the right
part shows unimportant nodes that are extracted to
de-clutter the graph. So, let's zoom into
the main graph. We can see many
rounded rectangles, which represent groups of operations that perform
certain functions, like convolution layers
and fully connected layers. Meanwhile, smaller ellipses represent individual
operations. We can see that
the two convolution layers here have the same color. This means that they have
identical internal structure, so you don't
have to expand them manually to
see the similarity. We can expand one of the layers by double-clicking
it. And here we can see that the model has a 2D
convolution operation to combine input data
with weights. The output from
this operation then has a bias variable added, and is then passed to
the rectified linear unit, which is an activation function. If we go up and expand one of the fully connected
layers instead, we can see similar weight
and bias variables inside. However, here the weights
are applied using a matrix multiplication instead of a convolution operation, and this is
the difference between a fully connected layer
and a convolution layer. Using this kind of
hierarchical graph, we can see the high-level
structure of the model, but still have
the ability to dive in and explore details on demand. One question that you
may have, though, is: do we really have to extract unimportant nodes to the side? And here's visual
evidence that we do, because if you don't, these high-degree
unimportant nodes, like error reporting and gradient calculation,
make the layout look like a spaghetti hairball. By
extracting them to the side, we get a nice diagram that shows the core flow
of the model. But users can still add things back to the main graph
if they want; I'm not going to
show that in this talk. So, we released the tool
as a part of TensorFlow, and after the release, we have seen
some key usage patterns. The most common usage pattern
is to inspect models. For example, one
user used the tool to verify if the code
produced what they intended. Another used the tool
to find the name of a node so that they could
do further exploration, like using other parts
of TensorBoard to track the evolution of
a particular input. People also use screenshots from our tool to explain
their models, like in TensorFlow's
official tutorials, in third-party articles, and when people ask
questions on StackOverflow. Many book and video
tutorials also suggest it as a best practice
for developers to iteratively rename
their nodes until their visualization matches
their mental model, especially when they share their model
with other colleagues. While we know that developers normally don't like
to change their code, the fact that they are
willing to do it to get a better visualization
clearly shows the value that they can get from this kind
of visualization. And finally, a common
public feedback is that model visualization is
a key feature of TensorFlow. Like a comment on
Quora says, "Almost all open source
machine learning packages lack the ability to
visualize the model and follow the
computation pipeline." Another blog post says, "visualization is fundamental to the creative process and our ability to
develop better models. So, visualization tools
like TensorBoard are a great step in
the right direction." To summarize, in this talk, I have shown you how I add automation to
visualization tools in critical domains, including
data exploration and understanding deep
learning models in TensorFlow. As I mentioned earlier, the main challenge for adding
automation to these tools is to balance between
automation and user control. So, let's revisit how we achieve this balance
in each system. For Vega-Lite, we provide sensible defaults for details that analysts normally
wouldn't care about, but when they do, we let them
override these defaults. For Voyager and CompassQL, we embed expert design and analysis knowledge in
the form of constraints, rankings, and
recommendation types, but we still let users steer the recommendations by
providing partial specifications. In the TensorFlow Graph Visualizer, we automatically apply layout techniques like hierarchical graph clustering, but still let users control the hierarchy by specifying name hierarchies
in their source code. Going forward, I would like to continue my research
mission to help people work with and
benefit from data via the design of visualization
and intelligent systems. One area I'm excited about is to
develop new applications of visualization
recommendation based on the platforms that we have built. And some of these
applications will drive further research questions in visualization and
recommendation as well. For example, imagine
if you could specify the information that you want
to show up on a dashboard, and the tool
automatically designs an interactive
dashboard for you. To support this system, we need to work on
a number of challenges. First, what should be
the query interface and language for
specifying user intent? Moreover, in
visualization research, we have been mostly focusing on the effectiveness of a single
chart and individual techniques. But to suggest a dashboard, we have to study and design new constraints
between multiple views, such as the consistency between visual encodings in
subviews of a dashboard. And finally, as we go from a single static chart to
an interactive dashboard, the space of possible
designs grows exponentially. How might we design
a ranking that can cover all of this design space? In CompassQL, we
kind of pre-define the model and have
the ranking upfront, but to scale to this space, I would like to explore
a probabilistic model that can learn over time, so we can learn from user interactions instead of having to develop
the whole model up front. Another research area
that I’m excited to work on is building interactive tool to support the development
and understanding of machine learning systems. Tooth-like graph visualization
in TensorFlow has lowered the difficulty to author and understand structure
of the model. However, users still
have to rely on textual code to alter visual concepts like
neural networks. When there are extension from
this work would be allowing developers to author networks by directly manipulating
the visualization. Besides easing creation,
we should also ensure that people
understand these systems too, especially as we rely more and more on
machine learning systems. For example, we have seen a bad instance where
machine learning was used to predict
future criminals, but it turns out that
the model was biased against African-Americans
because the training data that we put into
the system was biased. This is an important reminder
that the quality of the model largely depends on the data that we
feed into the system. Basically, garbage
in, garbage out. To catch these kinds
of biases, developers need better tools to
inspect the training data. An even bigger question is, how can we help people analyze and understand
model behavior? And this is not just a problem for machine learning developers. Oftentimes, other
stakeholders in product development,
like product managers and designers, or even customers,
want to understand the behavior of the model as well. How can we design
to support the needs of these stakeholders in
the development process? To achieve this goal, I don't think it's something
that I can answer alone. But I hope to
collaborate closely with other people in HCI, and also machine
learning experts, and also other stakeholders, to understand their needs
and best practices, so we can design tools
to support their needs, and enhance these tools with
automation to aid experts and
novices. Again, to add automation to these tools, we have to find
the right balance between automation
and user control. Lastly, I would like to say
that I think research is a team sport and collaboration
is key to making a big impact. I couldn't have done
all this work alone. So I would like to thank
my mentors and colleagues at UW, Tableau, and Google,
especially my adviser, Jeff, and my lab mates,
Dom and Arvind, who co-authored the systems
you see in this talk, especially on the visualization
and analysis side. I also thank my co-authors of the TensorFlow graph
visualization project at Google. And lastly, I'm
also grateful that I got a chance to
mentor a number of talented undergrad
research assistants over the years and worked with many wonderful collaborators in the open-source community. With that, I'm happy
to take more questions.>>Pardon me, because this is not my main area,
visualization, but I see you try to build a recommendation system
from a big chunk of data, which is too much to
look through, right? And eventually you
found that, "Hey, recommendation
maybe is too hard," and that's why you
have some buttons in Voyager and Voyager 2
so that you can say, "Adjust it to what you want," and you want to improve
your recommendation system. Eventually, maybe that will reduce the need to click
the buttons, right? But it looks like what
you are heading toward is something like personalization, where the user doesn't have to click anything at all.>>So, I don't think
we are going in the direction that users don't have to click anything
at all because, in analysis, what's
the goal of the tool? The tool is to support people, to leverage their domain
knowledge and intuition, which is hard to perfectly
put into the tool. But in terms of having
personalization, that's a part where we
can make the interaction between the automated
agent and the user collaborate more
tightly, so that's what we want to add, but it's
not like we want to make the analysis fully automated. At least in the near future, users will still be more flexible than
AI systems in terms of contributing to
the analysis goal.>>Yes, so that's the part that I'm actually curious about. Let's say you saw
the same problem over and over: I want to know what I should order next for
my company or whatever. If you saw this a million times, if you saw data from a million
companies, maybe eventually you can derive some
reinforcement learning system that can answer
this specific question, right?>>Yes. If we get a really
big set of data to analyze, in an ideal situation,
we may get there, but the thing is, this is
a very exploratory task: every single dataset has different aspects
that you have to explore. Even with the same dataset given
to different analysts, they can use slightly
different explorations, depending on their interests
or prior domain knowledge. Before we get there, I think you have to provide some way for users to
control the system too. Or, even in
other high-stakes applications, like
self-driving cars: obviously, we've heard
a lot about them. You would
want one that gives some control, so that
if something is going to
happen, say a self-driving car causes
a traffic jam, is it the fault of the company that built the car, or the fault of the person
in the car? If it's still the latter, you still want to have
some control over the automation. I think broadly,
not just for visualization tools, when we design
an automated system, you still want some degree
of user control in order to facilitate real usage.>>All right, makes sense.>>You answered some questions earlier about future
directions to go with Voyager, which might be recommendations for data transformations
or guidance about uncertainty. So let's say you build a hypothetical system
called Voyager 3. At that point, how
would you evaluate it? Does it still make
sense at that point to do another head-to-head comparison
against PoleStar or Voyager 2, or at
what point do you choose another method for
evaluating these tools?>>I think, at that point, we have to focus more on the specific aspects that
you’re improving, right? Because right now, we
don’t have, as we know, that learning user study
is time consuming. Right now, we want to
establish that this is the direction that
can be promising, but we don’t have
a chance to find you. For example, we showed a transformation and then
adding one more field, there can be other type
of recommendation like showing more uncertainty or when people add too many data
field to the visualization, we want them to step back too. So, trying to
optimize how should we provide recommendation would be something that you
have to do more at local certainty in
your system but right now, it’s like comparing
two both aspect, like this is totally
different interaction method. That’s why we are
comparing PoleStar. Going forward, definitely we have to do with
different studies.>>Would it be
a deployment study? Now, if there's
this integration in Jupyter, do you have the potential now
to track usage and see what the usage patterns
are, or is that another direction
you're going to go in?>>There's a tradeoff there,
because, right now, we have the infrastructure
for tracking user interaction when we run a user study, but it's not like we're
going to be able to deploy that without
invading users' privacy. Maybe we can do
a compromise method, which is, if we partner with some groups that are
going to use the tool, then we can turn tracking on for those groups, because we already
get permission from them. But it's not something
that we can just deploy in the wild without
invading their privacy.>>Yeah.>>Building perhaps
on his question, can I ask you to jump
back to the Voyager 2 slides?>>Sure.>>Sorry about that, and specifically the
Voyager 2 example.>>The Voyager 2 example, like the demo?>>The Voyager 2 demo, yeah.>>Sure. Okay.>>Oh, lovely. Great. If you jump forward about three slides, what I want to see is the variations of
the bivariate view.>>Oh, I see.>>Keep going. The next one is where
you start showing your recommendations
after you selected this.>>Yeah.>>And so on.
The next one. There we go. Of the related views below, which of them do you
feel are good examples where you've used
additional information
about visualization best practice, which is presumably
what you're trying to do? For example, do you
believe the thing on the bottom left here, the heat map, is what the user was looking
to learn when they said, "I want more information
about the thing up there"?>>So, different kinds of related views kind of serve
different purposes. For example,
summary: we know that if you have to do that in PoleStar, it's kind of tedious. Actually, in the user study, even though it's like
a training session, we saw that a lot of
participants, by the end, kind of used this as a shortcut to go to the summary view, because they wanted
to create this view. But to do this manually, you
have to click bin, two tabs up there, and then
you have to do a count, and not everybody can think
that fast to do it, right? So, in some ways,
summary is, in part, providing a shortcut. And in some cases,
it depends on the data: if your data has so many data points and
it's kind of cluttered, it's hard to see where
the mean or median is. So, the summary is sort of
like providing a shortcut. While if you go down
to field suggestions, that's where we want to provide a way for them to quickly click through if
there is something that->>Okay. The bottom left
is again a great example. A scatter plot with
variable-sized circles: which perception study told
you this was a good idea?>>I think that comes
back to Matt's question of how to best provide
this recommendation; that's still an area that we have to further
optimize in these tools. So, there are some cases
where you really want to use this, but maybe we should
not always show it, that's right. But there are some cases where this is useful, and we have to determine the threshold for when
it is useful. Right now, this is more
like a template that, well, if I want to add a categorical
field, then color is best. If you want to add
one more quantitative field, then size is the best
option, which we use here. Obviously, this methodology will break when you start
having more views, and we actually try
to have some cutoffs. For example, if you have a chart with
three quantitative fields, we no longer
have a pane for adding a fourth quantitative field, because it doesn't
make sense anymore. This one is kind of
at the boundary: yes, I agree with you, a lot of the time it's probably
not useful, but when it is, well, it's not beyond the boundary of
being useful. But you might also notice that there's a reason why
we put categorical above quantitative fields
here, because that one seems more
generally useful. This one is
at the boundary.>>At least, in
this case, would you generate a binned acceleration
and color by that, as a sort of way of converting that
quantitative field into a categorical view of it? Is there any sort of system
for doing that in there?>>Right now, we don't do
that, because we try to do one set of changes, like one type of change at a time. In a way, if you think about the visualization design space as a graph,
where each node is a single
visualization: right now, we recommend
neighbors of those nodes. We don't want to jump
too many steps; otherwise, people feel like, how do we get from the current state
to those states?>>But those are
neighbors in, vaguely, the description space, not necessarily the user's
conception space. That is to say, adding acceleration versus adding
binned acceleration are similar, to me, as a user, like adding acceleration as a dot size versus binned
acceleration as color.>>That's true. I think
then that’s hard like->>Do you think that the Voyager exploration
data process applies equally well
to other types of data? If this tabular data with quantitative and
categorical attributes, does it apply equally well to say if you have network data, a graph data, text
data, other forms?>>I think the idea of having partial ification
that you can guide the recommendation
by some metrics, I think that idea is still be applicable for
other types of analysis. But then, you have
graph data there might be certain type of operation
that you would like to do. That’s totally
different from adding mean and sum in tabular data. So, those kind of
part is going to change and you decide for
those other type of data.>>Well, if there are
no more questions, it’s been very good.
