Network Analysis Made Simple

Key Information

Conference Proposals

New content to include

ODSC Online Training

I had to fill out this form

Here are my responses

  • Bio: pasted a link to my professional bio
  • Approx. Course Duration: 4 hrs
  • Course Title: Network Analysis Made Simple

Abstract

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Learning Objectives

By the end of this tutorial, you will know how to:

  1. Use the NetworkX package and the Python programming language to manipulate and visualize graphs,
  2. Understand how graph algorithms work, particularly how to "think on" graphs,
  3. Use linear algebra to represent graph problems and speed them up,
  4. Read graph data from disk and write it back to disk.
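
A minimal sketch of the disk round trip in the last objective, assuming a plain edge list is an acceptable file format. The toy graph, the `weight` attribute, and the file name are made up for illustration; the tutorial datasets may well use other formats.

```python
import os
import tempfile

import networkx as nx

# Toy graph with one edge attribute (values are invented).
G = nx.Graph()
G.add_edge(1, 2, weight=3)
G.add_edge(2, 3, weight=5)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "toy.edgelist")  # hypothetical file name
    # Write one line per edge: "source target weight".
    nx.write_edgelist(G, path, data=["weight"])
    # Read it back, converting node labels and weights to integers.
    H = nx.read_edgelist(path, nodetype=int, data=[("weight", int)])

print(sorted(H.edges(data="weight")))
```

An edge list only stores edges, so isolated nodes and node attributes would be lost in this format; that is a limitation of the sketch, not of NetworkX.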

Course Outline

Part 1: Introduction (30 min)

  • Networks of all kinds: biological, transportation.
  • Representation of networks, NetworkX data structures
  • Basic quick-and-dirty visualizations
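
To make the "NetworkX data structures" bullet concrete, here is a small sketch of how a graph stores nodes, edges, and their metadata. The station names and the `kind` and `minutes` attributes are hypothetical and not taken from the tutorial datasets.

```python
import networkx as nx

# Hypothetical toy transportation network.
G = nx.Graph()
G.add_node("station_a", kind="hub")
G.add_node("station_b", kind="stop")
G.add_node("station_c", kind="stop")
G.add_edge("station_a", "station_b", minutes=12)
G.add_edge("station_a", "station_c", minutes=7)

# Node and edge metadata are exposed as dictionaries.
node_attrs = G.nodes["station_a"]
edge_attrs = G.edges["station_a", "station_b"]
neighbors = sorted(G.neighbors("station_a"))

print(G.number_of_nodes(), G.number_of_edges())
print(node_attrs, edge_attrs, neighbors)
```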

Part 2: Hubs and Paths (40 min)

  • Finding important nodes; applications
  • Pathfinding algorithms and their applications
  • Hands-on: implementing path-finding algorithms
  • Visualize degree and betweenness centrality distributions
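
A sketch of what the hands-on path-finding exercise could look like, assuming breadth-first search is the algorithm being implemented (the outline does not name one). The function name `path_exists` and the toy graph are my own; the centrality calls are standard NetworkX functions.

```python
from collections import deque

import networkx as nx


def path_exists(G, source, target):
    """Breadth-first search: True if target can be reached from source."""
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nbr in G.neighbors(node):
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
    return False


# Toy graph with two disconnected components: 1-2-3-4 and 5-6.
G = nx.Graph([(1, 2), (2, 3), (3, 4), (5, 6)])
found = path_exists(G, 1, 4)
missing = path_exists(G, 1, 5)

# Two common notions of node importance.
deg = nx.degree_centrality(G)
btw = nx.betweenness_centrality(G)
print(found, missing, deg[2], btw[2])
```

The centrality dictionaries map each node to a score, so plotting their distributions is a matter of passing `list(deg.values())` to a histogram.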

Part 3: Cliques, Triangles & Structures (40 min)

  • Definition of cliques
  • Triangles as the simplest complex clique, applications
  • Using path-finding algorithms to find structures in a graph
  • Open triangles as recommender systems.
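
The open-triangle idea can be sketched as follows: if a node knows two people who do not know each other, that pair is a candidate recommendation. The helper name `open_triangle_pairs` and the toy friendship graph are assumptions for illustration only.

```python
from itertools import combinations

import networkx as nx


def open_triangle_pairs(G, node):
    """Pairs of neighbors of `node` that are not connected to each other."""
    pairs = set()
    for u, v in combinations(sorted(G.neighbors(node)), 2):
        if not G.has_edge(u, v):
            pairs.add((u, v))
    return pairs


# Hypothetical friendship graph.
G = nx.Graph([("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")])

recs = open_triangle_pairs(G, "ann")
closed = nx.triangles(G, "ann")  # number of closed triangles through "ann"
print(recs, closed)
```

Here "bob" and "cat" already know each other (a closed triangle with "ann"), so only the pairs involving "dan" come back as suggestions.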

Part 4: Bipartite Graphs (30 min)

  • Definition of bipartite graphs, applications
  • Constructing bipartite graphs in NetworkX
  • Summary statistics of bipartite graphs
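
A minimal sketch of constructing a bipartite graph in NetworkX, projecting it onto one partition, and computing one summary statistic. The customer and product labels are invented, and the `bipartite` node attribute is a labelling convention that the functions below do not strictly require.

```python
import networkx as nx
from networkx.algorithms import bipartite

customers = ["c1", "c2", "c3"]
products = ["p1", "p2"]

B = nx.Graph()
B.add_nodes_from(customers, bipartite="customer")
B.add_nodes_from(products, bipartite="product")
B.add_edges_from([("c1", "p1"), ("c2", "p1"), ("c2", "p2"), ("c3", "p2")])

is_bip = bipartite.is_bipartite(B)

# Projection: two customers are linked if they share at least one product.
cust_graph = bipartite.projected_graph(B, customers)
cust_edges = sorted(tuple(sorted(e)) for e in cust_graph.edges())

# Bipartite degree centrality normalizes by the size of the opposite partition.
dc = bipartite.degree_centrality(B, customers)
print(is_bip, cust_edges, dc["c2"])
```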

Part 5: Linear Algebra and Graphs (40 min)

  • Graphs as matrices: adjacency and node feature matrices
  • Message passing operations and how they are used in graph deep learning
  • Speed vs. code readability tradeoffs when using matrix operations
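
One way to illustrate message passing as a matrix operation, assuming the simplest variant in which each node sums its neighbors' features. The feature values are made up, and the loop version is included only to show the readability side of the tradeoff; no timing claims are made here.

```python
import networkx as nx
import numpy as np

# Path graph 0 - 1 - 2.
G = nx.Graph([(0, 1), (1, 2)])
nodelist = [0, 1, 2]

A = nx.to_numpy_array(G, nodelist=nodelist)  # adjacency matrix, shape (3, 3)
F = np.array([[1.0], [10.0], [100.0]])       # node feature matrix, shape (3, 1)

# One round of message passing: each row of A selects that node's neighbors.
M = A @ F

# The same computation as an explicit loop over neighbors.
M_loop = np.array(
    [[sum(F[nbr, 0] for nbr in G.neighbors(n))] for n in nodelist]
)
print(M.ravel(), M_loop.ravel())
```

Node 1 sits between the other two, so it receives 1 + 100 = 101, while each end node receives only node 1's feature.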

Applications

Short examples of real-world data science applications/use-cases this course could be useful for

Recommender systems: Use graph structures to recommend products or professional connections.

Epidemiological analysis: Identify the most important spreaders of a disease.

Logistics: Identify the most efficient path to move goods and services.

Background knowledge needed

If you're familiar with the Jupyter notebook/lab interface, are comfortable with Python programming (loops, functions, conditionals), and know how to make plots in matplotlib, you'll be well-prepared for the tutorial!

Other information

PyCon 2017 Network Analysis Made Simple

Abstract

Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.

Audience

This tutorial is for Pythonistas who want to understand relationship problems, that is, data problems that involve relationships between entities. Participants should already have a grasp of for loops and basic Python data structures (lists, tuples and dictionaries). By the end of the tutorial, participants will have learned how to use the NetworkX package in the Jupyter environment, and will be comfortable visualizing large networks using Circos plots. Other plots will be introduced as well.

Outline

Part 1: Introduction (30 min)

  • Networks of all kinds: biological, transportation.
  • Representation of networks, NetworkX data structures
  • Basic quick-and-dirty visualizations

Part 2: Hubs and Paths (40 min)

  • Finding important nodes; applications
  • Pathfinding algorithms and their applications
  • Hands-on: implementing path-finding algorithms
  • Visualize degree and betweenness centrality distributions.

Part 3: Cliques, Triangles & Structures (40 min)

  • Definition of cliques
  • Triangles as the simplest complex clique, applications
  • Using path-finding algorithms to find structures in a graph.
  • Open triangles as recommender systems.

Part 4: Advanced Network Visualizations (40 min)

  • Basic concepts in rational layouts: node positioning, node colouring.
  • Plots: Circos, Arc, Hive, Matrix, Flow plots

Part 5: Bipartite Graphs (30 min)

  • Definition of bipartite graphs, applications
  • Constructing bipartite graphs in NetworkX.
  • Summary statistics of bipartite graphs
  • Double-Arc plots for visualization

Top trading cycle

Wikipedia entry

This is an example of a problem that is solvable using graphs!

I learned this from Sid Ravinutala, a data scientist who works for ID Insight.

The algorithm draws on bipartite graphs, their projections onto one of the partitions, cycle-finding, and other graph techniques.
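
A rough sketch of the top trading cycle procedure in plain Python, assuming each agent starts out owning exactly one house and ranks every house. The agent-to-agent pointers used here play the role of the projection onto the agent partition; the function name, the data layout, and the example preferences are my own and not from the Wikipedia entry.

```python
def top_trading_cycle(prefs, owner_of):
    """Assign houses by repeatedly trading along cycles.

    prefs: agent -> list of houses, most preferred first (complete ranking).
    owner_of: house -> agent who initially owns it.
    Returns a dict mapping each agent to the house they end up with.
    """
    remaining = set(prefs)
    assignment = {}
    while remaining:
        # Each remaining agent points at their favorite house still available.
        wants = {
            agent: next(h for h in prefs[agent] if owner_of[h] in remaining)
            for agent in remaining
        }
        # Follow agent -> owner-of-wanted-house pointers until an agent repeats.
        seen = []
        agent = next(iter(remaining))
        while agent not in seen:
            seen.append(agent)
            agent = owner_of[wants[agent]]
        cycle = seen[seen.index(agent):]
        # Everyone in the cycle receives the house they pointed at and leaves.
        for a in cycle:
            assignment[a] = wants[a]
        remaining -= set(cycle)
    return assignment


# Hypothetical example: a and b prefer each other's house; c also wants h1.
owner_of = {"h1": "a", "h2": "b", "h3": "c"}
prefs = {
    "a": ["h2", "h1", "h3"],
    "b": ["h1", "h2", "h3"],
    "c": ["h1", "h2", "h3"],
}
result = top_trading_cycle(prefs, owner_of)
print(result)
```

In this example a and b swap in the first round, and c is left pointing at their own house in the second round.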

This should go inside Network Analysis Made Simple as one of the case study chapters.

Network science

A collection of my thoughts on Network science and visualization.

Things I've made:

A thought: Use position, order, and color in graph visualizations.

SciPy 2019 Network Analysis Made Simple

Title: Network Analysis Made Simple

Abstract:
Through the use of NetworkX's API, tutorial participants will learn about the basics of graph theory and its use in applied network science. Starting with a computationally oriented definition of a graph and its associated methods, we will build up to progressively more advanced concepts (path and structure finding, and graph theory's relation to linear algebra), and close with an overview of scalable alternatives to NetworkX.

Keywords:
- data science
- network analysis
- network science
- graph theory
- graph analytics

Tutorial Topic: Applied Network Science

Detailed Description of Tutorial:

Content Updates from Prior Years:
- Updated last section to highlight the use of linear algebra in applied network science, particularly how linear algebra is used in deep learning applications, and how linear algebra can be used to accelerate certain graph operations.
- We will also discuss cuGraph as one possible alternative for scalability.

Outline:

Part 1: Introduction (30 min)
- Networks of all kinds: biological, transportation.
- Representation of networks, NetworkX data structures
- Basic quick-and-dirty visualizations

Part 2: Hubs and Paths (40 min)
- Finding important nodes; applications
- Pathfinding algorithms and their applications
- Hands-on: implementing path-finding algorithms
- Visualize degree and betweenness centrality distributions

Part 3: Cliques, Triangles & Structures (40 min)
- Definition of cliques
- Triangles as the simplest complex clique, applications
- Using path-finding algorithms to find structures in a graph
- Open triangles as recommender systems.

Part 4: Bipartite Graphs (30 min)
- Definition of bipartite graphs, applications
- Constructing bipartite graphs in NetworkX
- Summary statistics of bipartite graphs

Part 5: Linear Algebra and Graphs (40 min)
- Graphs as matrices: adjacency and node feature matrices
- Message passing operations and how they are used in graph deep learning
- Speed vs. code readability tradeoffs when using matrix operations
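
As one possible illustration of the speed vs. readability tradeoff, the sketch below counts triangles twice: once with a matrix power and once with NetworkX's built-in function. The toy graph is made up, and which version is faster will depend on graph size and sparsity, which this sketch does not measure.

```python
import networkx as nx
import numpy as np

# One triangle (0, 1, 2) plus a dangling edge to node 3.
G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3)])
A = nx.to_numpy_array(G, nodelist=sorted(G.nodes()))

# Entry (i, i) of A^3 counts closed walks of length 3 that start and end at i.
# Each triangle is counted 6 times: 3 starting nodes times 2 directions.
A3 = np.linalg.matrix_power(A, 3)
n_triangles_matrix = int(round(np.trace(A3) / 6))

# Readable alternative: per-node triangle counts, each triangle shared by 3 nodes.
n_triangles_nx = sum(nx.triangles(G).values()) // 3
print(n_triangles_matrix, n_triangles_nx)
```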

Student’s Python Knowledge Level: Intermediate

Short Bio + Teaching Experience:
Eric is a data scientist at the Novartis Institutes for Biomedical Research. There, he conducts biomedical data science research, with a focus on using Bayesian statistical methods in the service of making medicines for patients. Prior to Novartis, he was an Insight Health Data Fellow in the summer of 2017, and defended his doctoral thesis in the spring of 2017.

Eric is also an open source software developer, and has led the development of nxviz, a visualization package for NetworkX, and pyjanitor, a clean API for cleaning data in Python. In addition, he has made contributions to a range of open source tools, including PyMC3, matplotlib, bokeh, and CuPy.

His personal life motto is found in Luke 12:48.

This tutorial has been taught at prior SciPy, PyData, and PyCon conferences:
- PyCon 2018: https://www.youtube.com/watch?v=HkbMUrgzwMs
- SciPy 2018: https://www.youtube.com/watch?v=K5xiFDClgjo
- PyData NYC 2015: https://www.youtube.com/watch?v=wcrwASR5DCQ

Setup Instructions:
All setup instructions can be found on the GitHub repository: https://github.com/ericmjl/Network-Analysis-Made-Simple

For those who do not want to fiddle with setup, a pre-built Binder is available.

Skills Needed:
NumPy
Matplotlib

Summary for Topic:
Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks. Finally, at the end of the tutorial, allow yourself to be entertained and hopefully also enlightened by the link between applied network science and linear algebra!