CMSC 27100 — Lecture 10

The notes for this course began from a series originally written by Tim Ng, with extensions by David Cash and Robert Rand. I have modified them to follow our course.

Introduction to Graph Theory

Our last major topic in CMSC 27100 is graph theory. By graphs, we mean mathematical objects that consist of nodes connected by edges; you can scroll down to see some visualizations of graphs. Note that these graphs have nothing to do with the graphs you saw in calculus: it is just a collision of names.

It is hard to overstate how important graph theory is to so many varied disciplines. As a computer scientist you will use graphs to model all sorts of abstract relationships in data structures, and use graphs as intermediate tools in solving algorithmic problems. In order to give some examples, let's fix the definition of a graph.

A graph $G = (V,E)$ is a pair of sets with $V$ finite, and every element of $E$ a subset of $V$ of size $2$. We refer to the elements of $V$ as the vertices of $G$ and the elements of $E$ as the edges of $G$. When $e=\{u,v\}$ is an edge, we will usually write $e=uv$ instead of $\{u,v\}$. We define the empty graph to be $G= (\emptyset,\emptyset)$.
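To make the definition concrete, here is a minimal sketch (illustrative only, not part of the course materials) of how a graph matching this definition might be represented in Python, with $V$ as a set and each edge as a two-element frozenset so that $uv$ and $vu$ are the same object:

```python
# A graph G = (V, E): V is a finite set of vertices, and each edge
# is a 2-element frozenset {u, v} (so uv and vu denote the same edge).
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}

# Sanity check matching the definition: every edge is a size-2 subset of V.
assert all(len(e) == 2 and e <= V for e in E)

# The empty graph is simply (set(), set()).
empty_graph = (set(), set())
```

Using frozensets (rather than ordered pairs) bakes the "undirected" convention directly into the representation.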

This definition is presented for technical precision. In practice, graphs and statements about them are quite intuitive, and we won't appeal to the precise definition that often. Here is a visualization of a graph:

This graph is called "the" Petersen graph (there are actually many such graphs represented by this picture; we'll discuss why we refer to it as "the" Petersen graph anyway). The Petersen graph is a particularly interesting graph, not just because it looks neat, but because it happens to be a common counterexample for many results in graph theory.

Graphs can represent relatively obvious relationships, like friendships in social networks and how computers are connected. In machine learning, graphs are used to describe how a neural network is structured. One of my favorite applications is to compilers: To decide how to store variables in registers, compilers create graphs where the variables are nodes and edges indicate that two variables are "live" at the same time (and thus can't be put into the same register).

A few comments about Definition 10.1 are in order. The objects defined here are called finite simple undirected graphs in other contexts.

  1. They are finite because we require that $V$ is finite. In other contexts $V$ is allowed to be infinite.
  2. They are simple because we disallow self-loops (i.e. edges of the form $e=vv$) and we do not allow "multiple edges" in a graph. Self-loops are disallowed because we define edges to be subsets of size exactly 2. Multiple edges are disallowed because $E$ is a set and can contain an edge $e$ at most once (by our definition of a set).
  3. They are undirected because within an edge $e=\{u,v\}$ there is no notion of "direction". In the graph above this is represented by the fact that we just drew lines without arrow tips. A directed graph, which we don't have time to cover in detail, has directed edges, which are ordered pairs $e=(u,v)$. Diagrams for directed graphs have arrows to indicate the direction of each edge.

All graphs in this class will obey our definition. It is good practice for authors to clarify which flavor of graph they allow, since conventions vary!

Graph theory has a lot of definitions and terminology, but it is mostly intuitive, easy to remember, and even charming.

Two vertices $u$ and $v$ of a graph $G$ are adjacent or are neighbors if $uv\in E$. We say the edge $e$ is incident to $u$ if $u\in e$. If $e=uv$ is an edge, we refer to $u$ and $v$ as the ends or endpoints of $e$.

The neighborhood of a vertex $v$, denoted $N(v)$, is the set of neighbors of $v$. In symbols, this is \[ N(v) = \{u \in V \mid uv \in E\}. \] The neighborhood of a set of vertices $A \subseteq V$ is defined to be \[ N(A) = \left(\bigcup_{v \in A} N(v)\right)\setminus A. \] That is, $N(A)$ consists of all vertices that are adjacent to some vertex in $A$, but not in $A$ themselves.

The degree of a vertex $v$ in $G$, denoted $\deg(v)$ or $d(v)$, is the number of its neighbors. A graph $G$ is $k$-regular if every vertex in $G$ has degree $k$.
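These definitions translate directly into code. Here is a sketch (the function names are my own, chosen for illustration) of $N(v)$, $N(A)$, and $\deg(v)$ for a graph given as an edge set of frozensets:

```python
# Neighborhood N(v), set-neighborhood N(A), and degree deg(v),
# for a graph whose edges are 2-element frozensets.
def neighborhood(v, E):
    return {u for e in E if v in e for u in e if u != v}

def neighborhood_of_set(A, E):
    # N(A): vertices adjacent to some vertex of A, excluding A itself.
    return set().union(*(neighborhood(v, E) for v in A)) - set(A)

def degree(v, E):
    return len(neighborhood(v, E))

# Example: the path 1 - 2 - 3.
E = {frozenset({1, 2}), frozenset({2, 3})}
print(neighborhood(2, E))              # {1, 3}
print(neighborhood_of_set({1, 2}, E))  # {3}
print(degree(2, E))                    # 2
```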

Observe that every vertex in the Petersen graph has degree 3, so the Petersen graph is 3-regular, or cubic.
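As an aside, one well-known way to construct the Petersen graph (this construction is not from the notes above): take the 2-element subsets of $\{0,1,2,3,4\}$ as vertices, and join two vertices exactly when the corresponding subsets are disjoint. A quick sketch that also checks 3-regularity:

```python
from itertools import combinations

# Construct the Petersen graph: vertices are the 2-element subsets of
# {0,...,4}; two vertices are adjacent iff the subsets are disjoint.
V = [frozenset(p) for p in combinations(range(5), 2)]
E = {frozenset({u, v}) for u, v in combinations(V, 2) if not (u & v)}

print(len(V), len(E))  # 10 15

# Every vertex has degree 3, so the graph is 3-regular (cubic).
assert all(sum(1 for e in E if v in e) == 3 for v in V)
```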

The following results are among the first results in graph theory, proved by Leonhard Euler in 1736.

Let $G = (V,E)$ be a graph. Then $$\sum_{v \in V} \deg(v) = 2 \cdot |E|.$$

This proof is usually presented very quickly, and indeed it is true for a simple reason: Every edge $uv$ is incident to exactly two vertices: $u$ and $v$. Therefore, each edge contributes 2 to the sum of the degrees of the vertices in the graph.

In case that is too fast, here is a more detailed proof. Define a function $f: V \times E \to \{0,1\}$ by $$ f(v,e) = \begin{cases} 1 & \text{if $v$ is incident with $e$} \\ 0 & \text{otherwise} \end{cases}. $$ Now consider the sum $$ D = \sum_{v\in V}\sum_{e\in E} f(v,e). $$ On the one hand, we have $$ D = \sum_{v\in V}\deg(v) $$ since the inner sum counts the edges incident at $v$, which is the same as the degree of $v$. On the other hand, we can reverse the order of the sums to get $$ D = \sum_{e\in E}\sum_{v\in V} f(v,e). $$ The inner sum is always equal to $2$, since every edge is incident at exactly two vertices. Therefore $$ D = \sum_{e\in E} 2 = 2|E|, $$ proving the theorem.
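The theorem is easy to check empirically. The following sketch (assuming the frozenset edge representation from earlier) verifies that the sum of degrees equals $2|E|$ on a batch of random graphs:

```python
from itertools import combinations
import random

# Empirical check of the handshake theorem on random 8-vertex graphs:
# the sum of degrees always equals 2|E|.
random.seed(0)
for _ in range(100):
    V = range(8)
    E = {frozenset(p) for p in combinations(V, 2) if random.random() < 0.5}
    degrees = [sum(1 for e in E if v in e) for v in V]
    assert sum(degrees) == 2 * len(E)
```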

Let $G$ be a graph. Then the number of vertices of odd degree in $G$ is even.

Let $V_1, V_2 \subseteq V$ be the sets of vertices of $G$ of odd and even degree, respectively. Then $$2|E| = \sum_{v \in V_1} \deg(v) + \sum_{v \in V_2} \deg(v).$$ Since $\deg(v)$ is even for every $v \in V_2$, the sum $\sum_{v \in V_2} \deg(v)$ is even. Then since $2|E|$ is even, the sum $\sum_{v \in V_1} \deg(v)$ must also be even. But since $\deg(v)$ is odd for every $v \in V_1$, this implies that $|V_1|$ must be even: a sum of an odd number of odd terms would be odd.

It's this corollary that gives the handshaking lemma its name. When rephrased in terms of a group of people shaking hands, it says that there must be an even number of people who have shaken an odd number of hands.

Some Special Graphs

We now define some notation for certain graphs that arise frequently.

A complete graph on $n$ vertices is a graph $K_n = (V,E)$, where $|V|=n$ and $$E = \{uv \mid u,v \in V,\ u \neq v\}.$$ That is, all distinct vertices are neighbors with each other.
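Continuing the frozenset representation from earlier, a sketch of building $K_n$ (the helper name is illustrative); note that $K_n$ has $\binom{n}{2}$ edges:

```python
from itertools import combinations

# Build the complete graph K_n on vertices {0, ..., n-1}:
# every pair of distinct vertices is joined by an edge.
def complete_graph(n):
    V = set(range(n))
    E = {frozenset(p) for p in combinations(V, 2)}
    return V, E

V, E = complete_graph(5)
print(len(E))  # 10, i.e. C(5, 2) edges
```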

Here are some complete graphs:

A path on $n$ vertices is a graph $P_n = (V,E)$ with $n\geq 1$ distinct vertices $V = \{v_1,v_2,\ldots,v_n\}$ and \[ E = \{v_1v_2,\ v_2v_3,\ \ldots,\ v_{n-1}v_n\}. \]

Under this definition, we allow a graph consisting of a lone vertex with no edges to be a path. Here are some other graphs that are paths:

A cycle on $n$ vertices is a graph $C_n = (V,E)$ with $n\geq 3$ distinct vertices $V = \{v_1,v_2,\ldots,v_n\}$ and \[ E = \{v_1v_2,\ v_2v_3,\ \ldots,\ v_{n-1}v_n,\ v_nv_1\}. \]
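A sketch of building $P_n$ and $C_n$ in the same representation (helper names are my own). A cycle is just a path whose last vertex is joined back to the first, which is why we require $n \geq 3$:

```python
# Build the path P_n and the cycle C_n on vertices 0, ..., n-1.
def path_graph(n):
    assert n >= 1
    V = set(range(n))
    E = {frozenset({i, i + 1}) for i in range(n - 1)}
    return V, E

def cycle_graph(n):
    assert n >= 3
    V, E = path_graph(n)
    E = E | {frozenset({n - 1, 0})}  # close the path into a cycle
    return V, E

print(len(path_graph(4)[1]))   # 3 edges
print(len(cycle_graph(4)[1]))  # 4 edges
```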

And here are some graphs that are cycles:

The $n$-(hyper)cube graph is the graph $Q_n = (V,E)$, where $V = \{0,1\}^n$ (i.e., binary strings of length $n$) and $uv\in E$ if and only if $u$ and $v$ differ in exactly one bit.

It's called a cube (or hypercube) because one of the ways to draw it is to arrange the vertices and edges so that they form an $n$-dimensional cube.
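A sketch of constructing $Q_n$ directly from the definition (again with an illustrative helper name). Since flipping any one of the $n$ bits of a vertex gives a neighbor, $Q_n$ is $n$-regular:

```python
from itertools import product

# Build the n-cube Q_n: vertices are binary strings of length n,
# with an edge between strings that differ in exactly one bit.
def hypercube(n):
    V = {''.join(bits) for bits in product('01', repeat=n)}
    E = {frozenset({u, v}) for u in V for v in V
         if sum(a != b for a, b in zip(u, v)) == 1}
    return V, E

V, E = hypercube(3)
print(len(V), len(E))  # 8 12

# Q_3 is 3-regular: each vertex has one neighbor per bit position.
assert all(sum(1 for e in E if v in e) == 3 for v in V)
```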

Subgraphs

It will frequently be useful to speak of one graph "containing" another graph. For instance, we could describe a graph with a triangle as "containing a $K_3$". The notion of a subgraph makes this precise.

A graph $G' = (V',E')$ is a subgraph of a graph $G = (V,E)$ if $V' \subseteq V$ and $E' \subseteq E$. A subgraph $G'$ is spanning if $V'=V$.

Note that because we require that $G'$ itself be a graph, $E'$ must consist of edges with both ends in $V'$.

We will also use the concept of an induced subgraph.

The subgraph induced by a subset $V' \subseteq V$ is the graph $G'=(V',E')$, where $E' \subseteq E$ consists of every edge with both ends in $V'$. We say a subgraph $G'$ is an induced subgraph of $G$ if it is induced by some subset of $V$.

A subgraph being induced by $V'$ means that it takes all of the edges from $G$ that go between its vertices. For example, the graph induced by $V'=V$ is always the original graph.
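The induced subgraph is also easy to compute in the frozenset representation (the function name is illustrative): keep the chosen vertices and filter the edge set down to edges with both ends among them.

```python
# Induced subgraph: keep the vertices in Vp and every edge of G
# whose endpoints both lie in Vp.
def induced_subgraph(V, E, Vp):
    assert Vp <= V
    return Vp, {e for e in E if e <= Vp}

# Example: a triangle on {1, 2, 3} with a pendant vertex 4.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3}),
     frozenset({3, 4})}

Vp, Ep = induced_subgraph(V, E, {1, 2, 3})
print(len(Ep))  # 3: the whole triangle survives; the edge 3-4 does not
```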

Walks, Paths, and Cycles

A natural impulse when presented with a bunch of circles joined by lines is to see if we can get from one circle to another by following the lines. More formally, when we do this, what we're doing is checking if an object is connected to another object. It's not surprising to learn that this is a fundamental property of graphs that we'd like to work with.

A walk is a sequence of at least $2$ vertices $v_0,v_1,\ldots,v_k$ such that $v_{i-1}$ and $v_i$ are adjacent for $1\leq i \leq k$. The length of a walk $v_0,v_1,\ldots,v_k$ is $k$. A walk is closed if $v_0=v_k$, i.e. it starts and ends at the same vertex.

Note that a walk of length $k$ consists of $k+1$ vertices with $k$ "hops" between them.
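Checking whether a sequence of vertices is a walk is a one-liner in the frozenset representation (the function name is my own); note how repeated vertices are fine, but each consecutive pair must actually be an edge:

```python
# A sequence of at least 2 vertices is a walk iff every pair of
# consecutive vertices is adjacent.
def is_walk(seq, E):
    return len(seq) >= 2 and all(
        frozenset({seq[i - 1], seq[i]}) in E for i in range(1, len(seq)))

# The triangle on {1, 2, 3}.
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}

print(is_walk([1, 2, 3, 1, 2], E))  # True: vertices may repeat
print(is_walk([1, 2, 2], E))        # False: there is no edge 2-2
```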

Walks are different from paths and cycles that we defined earlier. Walks are sequences of vertices and are not graphs. You can think of them as "histories" for how someone might wander around a graph and write down the vertices they visited. If a vertex is visited multiple times then it appears in the walk multiple times. On the other hand, paths and cycles are graphs; they consist of a set of vertices and a set of edges.

Walks are in some sense simpler to manipulate: We can for example concatenate walks to produce new walks (assuming the first ends where the second begins). This isn't true of paths, since the result of putting two paths together could be something like a cycle.

Note that the above definitions are not universal and you should be very careful when consulting other materials—in particular, the Rosen textbook uses "path" to mean walk and "simple path" to mean path. The terminology we use here is more standard.