Network Analysis Made Simple
ODSC Online Training
I had to fill out this form
Here are my responses
Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.
By the end of this tutorial, you will learn how to:
Part 1: Introduction (30 min)
Part 2: Hubs and Paths (40 min)
Part 3: Cliques, Triangles & Structures (40 min)
Part 4: Bipartite Graphs (30 min)
Part 5: Linear Algebra and Graphs (40 min)
Short examples of real-world data science applications/use-cases this course could be useful for
Recommender systems: Using graph structures to recommend products or professional connections.
Epidemiological analysis: Identifying the most important spreaders of a disease.
Logistics: Identifying the most efficient path to move goods and services.
If you're familiar with the Jupyter notebook/lab interface, are comfortable with Python programming (loops, functions, conditionals), and know how to make plots in matplotlib, you'll be well-prepared for the tutorial!
PyCon 2017 Network Analysis Made Simple
Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks.
This tutorial is for Pythonistas who want to understand relationship problems, that is, data problems that involve relationships between entities. Participants should already have a grasp of for loops and basic Python data structures (lists, tuples and dictionaries). By the end of the tutorial, participants will have learned how to use the NetworkX package in the Jupyter environment, and will be comfortable visualizing large networks using Circos plots. Other plots will be introduced as well.
Part 1: Introduction (30 min)
Part 2: Hubs and Paths (40 min)
Part 3: Cliques, Triangles & Structures (40 min)
Part 4: Advanced Network Visualizations (40 min)
Part 5: Bipartite Graphs (30 min)
Top trading cycle
This is an example of a problem that is solvable using graphs!
I learned this from Sid Ravinutala, a data scientist who works for ID Insight.
Part of the algorithm includes bipartite graphs and their projections to one of the partitions, cycle-finding, and others.
This should go inside Network Analysis Made Simple as one of the case study chapters.
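To make the idea concrete, here is a minimal sketch of top trading cycles in plain Python. It assumes each agent starts out owning exactly one item and ranks every item strictly; the function name and the toy data are made up for illustration. Following an agent to their favourite remaining item and then to that item's owner amounts to walking a projection of the agent-item bipartite graph onto the agents, and the cycle-finding step is the inner `while` loop.

```python
def top_trading_cycle(preferences, owner_of):
    """Assign items via top trading cycles.

    preferences: dict mapping agent -> list of items, most preferred first.
    owner_of: dict mapping item -> the agent who initially owns it.
    Assumes every agent owns exactly one item and ranks all items.
    """
    remaining_agents = set(preferences)
    remaining_items = set(owner_of)
    assignment = {}

    while remaining_agents:
        # Each remaining agent points at their favourite remaining item.
        top_choice = {
            agent: next(i for i in preferences[agent] if i in remaining_items)
            for agent in remaining_agents
        }

        # Follow agent -> favourite item -> its owner until an agent repeats.
        path = []
        agent = min(remaining_agents)  # deterministic starting point
        while agent not in path:
            path.append(agent)
            agent = owner_of[top_choice[agent]]
        cycle = path[path.index(agent):]

        # Everyone in the cycle receives the item they point at, then leaves.
        for member in cycle:
            assignment[member] = top_choice[member]
        for member in cycle:
            remaining_agents.discard(member)
            remaining_items.discard(top_choice[member])

    return assignment


# Toy data (hypothetical): a and b each prefer the other's house.
preferences = {
    "a": ["house_b", "house_a", "house_c"],
    "b": ["house_a", "house_b", "house_c"],
    "c": ["house_a", "house_b", "house_c"],
}
owner_of = {"house_a": "a", "house_b": "b", "house_c": "c"}
result = top_trading_cycle(preferences, owner_of)
```

In this toy case, a and b swap houses in the first round, and c keeps their own house in the second.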
Network science
A collection of my thoughts on Network science and visualization.
Things I've made:
A thought: Use position, order, and color in graph visualizations.
SciPy 2019 Network Analysis Made Simple
Title: Network Analysis Made Simple
Abstract:
Through the use of NetworkX's API, tutorial participants will learn about the basics of graph theory and its use in applied network science. Starting with a computationally-oriented definition of a graph and its associated methods, we will build out into progressively more advanced concepts (path and structure finding, and graph theory's relation to linear algebra), as well as an overview of scalable alternatives to NetworkX.
Keywords:
- data science
- network analysis
- network science
- graph theory
- graph analytics
Tutorial Topic: Applied Network Science
Detailed Description of Tutorial:
Content Updates from Prior Years:
- Updated last section to highlight the use of linear algebra in applied network science, particularly how linear algebra is used in deep learning applications, and how linear algebra can be used to accelerate certain graph operations.
- We will also discuss cuGraph as one possible alternative for scalability.
Outline:
Part 1: Introduction (30 min)
- Networks of all kinds: biological, transportation.
- Representation of networks, NetworkX data structures
- Basic quick-and-dirty visualizations
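As a rough illustration of the representation topic in Part 1, the sketch below builds a tiny transportation-style graph with NetworkX and attaches metadata to nodes and edges. The node names, the `kind` attribute, and the travel times are invented for this example and are not taken from the tutorial datasets.

```python
import networkx as nx

# Nodes and edges can both carry arbitrary key-value metadata.
G = nx.Graph()
G.add_node("depot", kind="warehouse")
G.add_node("store_1", kind="shop")
G.add_node("store_2", kind="shop")
G.add_edge("depot", "store_1", minutes=12)
G.add_edge("store_1", "store_2", minutes=7)

# data=True yields (node, attribute_dict) pairs
# and (u, v, attribute_dict) triples.
shops = [n for n, d in G.nodes(data=True) if d["kind"] == "shop"]
total_minutes = sum(d["minutes"] for _, _, d in G.edges(data=True))

print(len(G), G.number_of_edges())    # number of nodes, number of edges
print(sorted(G.neighbors("store_1")))  # nodes adjacent to store_1
```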
Part 2: Hubs and Paths (40 min)
- Finding important nodes; applications
- Pathfinding algorithms and their applications
- Hands-on: implementing path-finding algorithms
- Visualize degree and betweenness centrality distributions
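The hands-on portion of Part 2 could look something like the following sketch: a hand-written breadth-first search that checks whether a path exists, next to NetworkX's built-in centrality and shortest-path functions. The helper name `path_exists` and the toy edge list are assumptions made for illustration.

```python
from collections import deque

import networkx as nx


def path_exists(G, source, target):
    """Breadth-first search: True if target is reachable from source."""
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in G.neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return False


# Toy graph with two disconnected components.
G = nx.Graph()
G.add_edges_from(
    [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("e", "f")]
)

# Two common notions of node importance.
degree = nx.degree_centrality(G)            # fraction of other nodes touched
betweenness = nx.betweenness_centrality(G)  # share of shortest paths through a node
hub = max(degree, key=degree.get)

print(hub)
print(path_exists(G, "a", "d"), path_exists(G, "a", "e"))
print(nx.shortest_path(G, "a", "d"))
```

Here node "b" comes out on top for both measures, since it has the most neighbors and sits on every shortest path out of "a".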
Part 3: Cliques, Triangles & Structures (40 min)
- Definition of cliques
- Triangles as the simplest complex clique, applications
- Using path-finding algorithms to find structures in a graph
- Open triangles as recommender systems.
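One way to sketch the triangle ideas from Part 3: a node is in a triangle when two of its neighbors are connected to each other, and an open triangle (a friend of a friend who is not yet a friend) is a candidate recommendation. The function names and the toy friendship graph below are hypothetical.

```python
from itertools import combinations

import networkx as nx


def in_triangle(G, node):
    """True if any two neighbors of node are themselves connected."""
    return any(
        G.has_edge(u, v) for u, v in combinations(G.neighbors(node), 2)
    )


def recommend(G, node):
    """Open triangles: neighbors of neighbors not yet connected to node."""
    candidates = set()
    for neighbor in G.neighbors(node):
        for second in G.neighbors(neighbor):
            if second != node and not G.has_edge(node, second):
                candidates.add(second)
    return candidates


# Toy friendship graph: one closed triangle plus a dangling friend.
G = nx.Graph(
    [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")]
)

print(in_triangle(G, "ann"), in_triangle(G, "dan"))
print(recommend(G, "ann"))
print(nx.triangles(G))            # built-in per-node triangle counts
print(list(nx.find_cliques(G)))   # maximal cliques
```

In this graph, "dan" is the only suggestion for "ann", because "cat" connects them but they are not yet linked.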
Part 4: Bipartite Graphs (30 min)
- Definition of bipartite graphs, applications
- Constructing bipartite graphs in NetworkX
- Summary statistics of bipartite graphs
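For Part 4, a small sketch of how a bipartite graph might be constructed and summarized with NetworkX's `bipartite` module. The customer and product names are invented, and labelling the partitions with strings in a `bipartite` node attribute is one convention among several (integer labels 0 and 1 are also common).

```python
import networkx as nx
from networkx.algorithms import bipartite

# Two node sets; edges only run between them, never within one set.
G = nx.Graph()
G.add_nodes_from(["alice", "bob", "carol"], bipartite="customer")
G.add_nodes_from(["apples", "bread"], bipartite="product")
G.add_edges_from(
    [
        ("alice", "apples"),
        ("bob", "apples"),
        ("bob", "bread"),
        ("carol", "bread"),
    ]
)

customers = [
    n for n, d in G.nodes(data=True) if d["bipartite"] == "customer"
]

print(bipartite.is_bipartite(G))

# Projection: customers are linked if they share at least one product.
customer_graph = bipartite.projected_graph(G, customers)
print(sorted(customer_graph.neighbors("bob")))

# Bipartite degree centrality normalizes by the size of the other partition.
centrality = bipartite.degree_centrality(G, customers)
print(centrality["bob"])
```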
Part 5: Linear Algebra and Graphs (40 min)
- Graphs as matrices: adjacency and node feature matrices
- Message passing operations and how they are used in graph deep learning
- Speed vs. code readability tradeoffs when using matrix operations
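The link between graphs and linear algebra in Part 5 can be sketched with NumPy alone. The example below assumes a four-node path graph (0-1-2-3) and a single made-up feature per node; it shows that powers of the adjacency matrix count walks, and that multiplying the adjacency matrix by a node feature matrix is a simple sum-over-neighbors form of message passing.

```python
import numpy as np

# Adjacency matrix of the path graph 0 - 1 - 2 - 3.
A = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
)

# Row sums give each node's degree.
degrees = A.sum(axis=1)

# Entry (i, j) of A^k counts the walks of length k from node i to node j.
walks_2 = np.linalg.matrix_power(A, 2)
walks_3 = np.linalg.matrix_power(A, 3)

# Node feature matrix: one row per node, one column per feature.
# Only node 0 starts with a nonzero feature.
X = np.array([[1.0], [0.0], [0.0], [0.0]])

# One round of message passing: each node sums its neighbors' features.
messages = A @ X

print(degrees)
print(walks_2[0, 2], walks_3[0, 3])
print(messages.ravel())
```

After one round, only node 1 has received node 0's feature; repeated rounds would spread it further along the path.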
Student’s Python Knowledge Level: Intermediate
Short Bio + Teaching Experience:
Eric is a data scientist at the Novartis Institutes for Biomedical Research. There, he conducts biomedical data science research, with a focus on using Bayesian statistical methods in the service of making medicines for patients. Prior to Novartis, he was an Insight Health Data Fellow in the summer of 2017, and defended his doctoral thesis in the spring of 2017.
Eric is also an open source software developer, and has led the development of nxviz, a visualization package for NetworkX, and pyjanitor, a clean API for cleaning data in Python. In addition, he has made contributions to a range of open source tools, including PyMC3, matplotlib, bokeh, and CuPy.
His personal life motto is found in Luke 12:48.
This tutorial has been taught at prior SciPy, PyData, and PyCon conferences.
- PyCon 2018: https://www.youtube.com/watch?v=HkbMUrgzwMs
- SciPy 2018: https://www.youtube.com/watch?v=K5xiFDClgjo
- PyData NYC 2015: https://www.youtube.com/watch?v=wcrwASR5DCQ
Setup Instructions:
All setup instructions can be found on the GitHub repository: https://github.com/ericmjl/Network-Analysis-Made-Simple
For those who do not want to fiddle with setup, a pre-built Binder is available.
Skills Needed:
NumPy
Matplotlib
Summary for Topic:
Have you ever wondered about how those data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this tutorial, we will use a variety of datasets to help you understand the fundamentals of network thinking, with a particular focus on constructing, summarizing, and visualizing complex networks. Finally, at the end of the tutorial, allow yourself to be entertained and hopefully also enlightened by the link between applied network science and linear algebra!