Low-level API
The nxviz low-level API provides the most flexibility for constructing rational graph visualizations.
As always with rational graph visualizations, there is a process that helps us compose beautiful visualizations. First, we place the nodes using the layout functions. Next, we apply data-driven visual styling to the nodes. Finally, we decide how to draw edges (as lines or Bézier curves) and style them according to data.
%config InlineBackend.figure_format = 'retina'
%load_ext autoreload
%autoreload 2
Example
As an example, let's see how we can customize the Circos plot by using the low-level API.
Some of these customizations could also be accomplished with the higher-level API, but we will intentionally show the low-level way of handling them so that you get a feel for how to implement low-level customizations yourself.
from random import choice
import networkx as nx
import numpy as np
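G = nx.erdos_renyi_graph(n=71, p=0.1)
# Attach a categorical "group" and a quantitative "value" to every node.
for n, d in G.nodes(data=True):
    G.nodes[n]["group"] = choice(["a", "b", "c"])
    G.nodes[n]["value"] = np.random.exponential()
# Seed NumPy here; only the edge values drawn below are affected by it.
np.random.seed(44)
for u, v, d in G.edges(data=True):
    G.edges[u, v]["edge_value"] = np.random.exponential()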
Node and Edge Table
The node and edge tables are the low-level data structures that are used in creating network visualizations. These are pandas DataFrames.
from nxviz import utils
nt = utils.node_table(G)
nt.head()
The node table is indexed by node ID, and all of the metadata attributes are stored as columns.
et = utils.edge_table(G)
et.head()
For the edge table, the "source" and "target" columns are the node IDs in the node table. Every other column is a metadata field. The index carries no semantic meaning here.
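To make the shape of the two tables concrete, here is a minimal sketch that builds the same kind of records with plain Python dictionaries instead of pandas. The toy graph, the variable names, and the attribute values are all made up for illustration; nxviz itself returns DataFrames, as described above.

```python
# Hypothetical toy graph: node attributes keyed by node ID, edges as (u, v, attrs).
node_attrs = {
    0: {"group": "a", "value": 0.7},
    1: {"group": "b", "value": 1.3},
    2: {"group": "a", "value": 0.2},
}
edge_list = [
    (0, 1, {"edge_value": 0.5}),
    (1, 2, {"edge_value": 2.1}),
]

# Node table: one row per node, keyed (indexed) by node ID.
node_rows = {node: dict(attrs) for node, attrs in node_attrs.items()}

# Edge table: one row per edge; "source" and "target" hold node IDs,
# and every other key is a metadata field. Row position carries no meaning.
edge_rows = [{"source": u, "target": v, **attrs} for u, v, attrs in edge_list]

# Every endpoint in the edge table can be looked up in the node table.
endpoints_known = all(
    row["source"] in node_rows and row["target"] in node_rows
    for row in edge_rows
)
print(edge_rows[0])
```

The lookup at the end is the property that matters for drawing: an edge row only names its endpoints, and everything about those endpoints (position, group, value) is found through the node table.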
Circos layout
Following the principles of rational graph visualization, we start by declaring the layout that we want. Since in our example we'll be using the Circos plot layout, let's start by obtaining the (x, y) coordinate positions of each node that we want to plot.
The nxviz.layouts module contains the Circos plot layout function that we'll want.
Under the hood, it uses pandas' group-by and sorting functionality to get nodes into the order that we want.
If you wish to group and sort in a customized fashion, then you'll have to implement the functionality yourself.
from nxviz import layouts
pos = layouts.circos(nt, group_by="group")
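If you do need a custom grouping and sorting order, the core idea can be sketched without nxviz: order the nodes, then space them evenly around a circle. The helper name circular_positions, its parameters, and the toy data below are hypothetical, and this is not nxviz's own implementation.

```python
import math


def circular_positions(nodes, group_key, sort_key, radius=1.0):
    """Place nodes evenly on a circle, ordered by group and then by value.

    `nodes` maps node ID -> attribute dict. Hypothetical helper for
    illustration only; swap in any ordering rule you like.
    """
    ordered = sorted(
        nodes, key=lambda n: (nodes[n][group_key], nodes[n][sort_key])
    )
    if not ordered:
        return {}
    step = 2 * math.pi / len(ordered)  # angular gap between neighbours
    return {
        node: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, node in enumerate(ordered)
    }


toy_nodes = {
    0: {"group": "b", "value": 0.4},
    1: {"group": "a", "value": 0.9},
    2: {"group": "a", "value": 0.1},
    3: {"group": "b", "value": 0.2},
}
toy_pos = circular_positions(toy_nodes, group_key="group", sort_key="value")
```

Changing the sort key inside the helper is all it takes to get a different arrangement, for example reversing the order within each group.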
Node styling
Next, we concern ourselves with the styling of the nodes.
Here, the nxviz.encodings submodule becomes useful for us.
When drawing nodes, their color, transparency, and size can be most naturally mapped to data.
- Transparency requires that a quantitative value be mappable to the interval (0, 1).
- Size requires that a quantitative value be mappable to the positive floats (0, +inf).
- Color is the trickiest of them all:
  - A categorical variable should be mapped to a categorical colormap.
  - A continuous variable should be mapped to a continuous colormap.
The choice of colormap is always going to be dependent on the user. If you're looking for a guide on how to choose colormaps, the Points of View guide to colors is a very good resource to start with.
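The transparency and size requirements listed above can be sketched with two small functions. The names rescale and sizes_from, the lower bound of 0.1 for alpha, and the sample numbers are assumptions made for this illustration, not part of nxviz.

```python
import math


def rescale(values, lo=0.1, hi=1.0):
    """Min-max rescale `values` into the interval [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values identical: fall back to the upper bound.
        return [hi for _ in values]
    return [lo + (v - vmin) / (vmax - vmin) * (hi - lo) for v in values]


def sizes_from(values):
    """Square-root scaling; assumes the input values are non-negative."""
    return [math.sqrt(v) for v in values]


edge_values = [0.2, 1.0, 1.8]
alphas = rescale(edge_values)  # smallest value -> 0.1, largest -> 1.0
sizes = sizes_from([0.25, 1.0, 4.0])
```

Keeping the lower bound above zero means that even the smallest value still produces a visible glyph, which is usually what you want for transparency.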
How do we handle styling of nodes? The primary way of doing so is to have a Python function that maps from the node table's column of values (passed in as a pandas Series) to any color specification that matplotlib can handle:
- Strings: "black", "yellow", "blue", etc.
- RGB(A): (0.1, 0.8, 0.3, 0.5)
- Hexadecimal: #FFFFFF, #000000, #A7C91F
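These specifications are interchangeable ways of writing the same thing. As a rough illustration, the hypothetical helper below converts a hexadecimal string into an RGBA tuple by hand; matplotlib performs this kind of conversion for you, so the sketch is only meant to show how the two forms relate.

```python
def hex_to_rgba(hex_color, alpha=1.0):
    """Convert '#RRGGBB' into an (r, g, b, a) tuple of floats in [0, 1]."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected '#RRGGBB', got {hex_color!r}")
    # Each pair of hex digits is one channel in the range 0..255.
    r, g, b = (int(digits[i : i + 2], 16) / 255 for i in (0, 2, 4))
    return (r, g, b, alpha)


white = hex_to_rgba("#FFFFFF")
olive = hex_to_rgba("#A7C91F", alpha=0.5)
```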
Here are two examples, one using a highly custom mapping and the other using matplotlib's colormaps, followed by a function that maps values to node sizes.
import matplotlib.pyplot as plt
import pandas as pd
from nxviz import encodings as aes

def group_colormap(data: pd.Series):
    """Map each group label to a hand-picked color."""
    cmap = {"a": "black", "b": "blue", "c": "red"}
    return data.apply(lambda x: cmap.get(x))

def value_colormap(data: pd.Series):
    """Value colormap."""
    # Normalize rescales the values linearly into [0, 1] for the colormap.
    norm = plt.Normalize(vmin=data.min(), vmax=data.max())
    cmap = plt.get_cmap("viridis")
    return data.apply(lambda x: cmap(norm(x)))

def node_size(data: pd.Series):
    """Scale node size by the square root of the value."""
    return data.apply(np.sqrt)
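To see what a normalize-then-colormap pipeline does conceptually, here is a library-free sketch that interpolates between two colors. The function name two_color_map and the blue-to-red gradient are assumptions for illustration; a real colormap such as viridis follows a more carefully designed path through color space.

```python
def two_color_map(values, start=(0.0, 0.0, 1.0), end=(1.0, 0.0, 0.0)):
    """Map numbers to RGBA tuples on a straight-line gradient from `start` to `end`."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid dividing by zero when all values are equal
    colors = []
    for v in values:
        t = (v - vmin) / span  # normalized position in [0, 1]
        rgb = tuple(s + t * (e - s) for s, e in zip(start, end))
        colors.append(rgb + (1.0,))  # fully opaque
    return colors


gradient = two_color_map([0.0, 5.0, 10.0])
```

The smallest value lands on the start color and the largest on the end color, with everything in between blended proportionally.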
We can now combine everything into something that basically reconstructs nodes.draw.
import inspect
from nxviz import nodes, plots
print(inspect.getsource(nodes.draw))
ax = plt.gca()
# Customize node styling
nt = utils.node_table(G)
pos = layouts.circos(nt, group_by="group", sort_by="value")
node_color = group_colormap(nt["group"])
alpha = nodes.transparency(nt, alpha_by=None)
size = nodes.node_size(nt, "value")
patches = nodes.node_glyphs(
    nt, pos, node_color=node_color, alpha=alpha, size=size
)
for patch in patches:
    ax.add_patch(patch)
plots.rescale(G)
plots.aspect_equal()
Voila! We now have a sonic hedgehog-style node layout! Pretty cool, isn't it?
Adding in edges
Once the node layout is complete, customizing the edge styling is a matter of concerning ourselves with:
- Line width
- Transparency
- Color
We could customize more, but these three are the most commonly used for mapping data to style.
As with node layouts, we basically have to re-create nxviz.edges.draw with customized data-to-style mapping functions.
from nxviz import edges
print(inspect.getsource(edges.draw))
from nxviz import lines
# Customize node styling
ax = plt.gca()
nt = utils.node_table(G)
pos = layouts.circos(nt, group_by="group", sort_by="value")
node_color = group_colormap(nt["group"])
alpha = nodes.transparency(nt, alpha_by=None)
size = nodes.node_size(nt, "value")
patches = nodes.node_glyphs(
    nt, pos, node_color=node_color, alpha=alpha, size=size
)
for patch in patches:
    ax.add_patch(patch)
# Customize edge styling
et = utils.edge_table(G)
edge_color = edges.edge_colors(et, nt=None, color_by=None, node_color_by=None)
lw = np.sqrt(et["edge_value"])
alpha = edges.transparency(et, alpha_by=None)
patches = lines.circos(
    et, pos, edge_color=edge_color, alpha=alpha, lw=lw, aes_kw={"fc": "none"}
)
for patch in patches:
    ax.add_patch(patch)
plots.rescale(G)
plots.aspect_equal()
plots.despine()
Looking at the plot, we might find that expressing the edges' edge_value as line width is not that effective. Instead, we might want to express it using alpha.
from nxviz import lines
# Customize node styling
ax = plt.gca()
nt = utils.node_table(G)
pos = layouts.circos(nt, group_by="group", sort_by="value")
node_color = group_colormap(nt["group"])
alpha = nodes.transparency(nt, alpha_by=None)
size = nodes.node_size(nt, "value")
patches = nodes.node_glyphs(
    nt, pos, node_color=node_color, alpha=alpha, size=size
)
for patch in patches:
    ax.add_patch(patch)
# Customize edge styling
et = utils.edge_table(G)
edge_color = edges.edge_colors(et, nt=None, color_by=None, node_color_by=None)
lw = edges.line_width(et, lw_by=None)
alpha = edges.transparency(et, alpha_by="edge_value")
patches = lines.circos(
    et, pos, edge_color=edge_color, alpha=alpha, lw=lw, aes_kw={"fc": "none"}
)
for patch in patches:
    ax.add_patch(patch)
plots.rescale(G)
plots.aspect_equal()
plots.despine()
Looking at this plot, it's a lot easier for us to see the important edges (as visualized by the alpha value).
Conclusion
Throughout this notebook, we dropped down from the mid-level API to the low-level API, where we get to customize node and edge styling to our heart's content. The patterns are easy to follow. For nodes, we customize the size, color and transparency. For edges we customize the line width, color, and transparency. We can then compose them together into the plots we see above.