Network Analysis with igraph |
---|

igraph graphs are most often created by giving the list of the edges
to the `graph()`

function:

`>`

g1 <- graph( c( 0,1, 1,2, 2,2, 2,3 ) )`>`

g1

Vertices: 4 Edges: 4 Directed: TRUE Edges: [0] 0 -> 1 [1] 1 -> 2 [2] 2 -> 2 [3] 2 -> 3

The * directed* parameter can be set to

`FALSE`

to create undirected graphs:
`>`

g1 <- graph( c( 0,1, 1,2, 2,2, 2,3 ), directed=FALSE )`>`

g1

Vertices: 4 Edges: 4 Directed: FALSE Edges: [0] 0 -- 1 [1] 1 -- 2 [2] 2 -- 2 [3] 2 -- 3

By default the number of vertices in the graph are determined
from the edge list vector, it is the largest vertex id plus one.
The * n* parameter can be supplied to override
this. Giving a small number here than the largest vertex id plus one
in the edge list has no effect at all:

`> `

graph( c( 0,1, 1,2, 2,2, 2,3, ), n=10)

Vertices: 10 Edges: 4 Directed: TRUE Edges: [0] 0 -> 1 [1] 1 -> 2 [2] 2 -> 2 [3] 2 -> 3

`> `

graph( c( 0,1, 1,2, 2,2, 2,3, ), n=1)

Vertices: 4 Edges: 4 Directed: TRUE Edges: [0] 0 -> 1 [1] 1 -> 2 [2] 2 -> 2 [3] 2 -> 3

If you happen to have the edge list of a graph in a two-column matrix,
then you can create an igraph graph by simply giving the transposed
matrix to `graph()`

:

`> `

edgelist

[,1] [,2] [1,] 0 1 [2,] 1 2 [3,] 2 2 [4,] 2 3

`> `

graph( t(edgelist) )

Vertices: 4 Edges: 4 Directed: TRUE Edges: [0] 0 -> 1 [1] 1 -> 2 [2] 2 -> 2 [3] 2 -> 3

The “inverse” operation of `graph()`

is `get.edgelist()`

; this takes a graph object as
an argument and returns an edge list in a two-column matrix.

An A adjacency matrix is a representation of a directed graph, it is a |V|x|V| matrix, |V| being the number of vertices and A(i,j) is 1 if there is an edge from vertex i-1 to vertex j-1 and zero otherwise. This definition can be generalized to represent multiple edges, ie. A(i,j) is the number of edges from vertex i-1 to vertex j-1, and can represent undirected graphs as well.

igraph provides the `graph.adjacency()`

to create
graphs from adjacency matrices. Its only required parameter is the
adjacency matrix, and it has an optional * mode*
parameter which specifies how to interpret the adjacency matrix. Its
possible values:

`directed`

A directed graph will be created, A(i,j) gives the number of edges from vertex i-1 to vertex j-1.

`undirected`

An undirected graph will be created, the number of edges added from vertex i-1 to vertex j-1 is the maximum of A(i,j) and A(j,i).

`max`

An undirected graph will be created, the number of edges added from vertex i-1 to vertex j-1 is the maximum of A(i,j) and A(j,i). This mode is the same as

`undirected`

.`min`

An undirected graph will be created, the number of edges from vertex i-1 to vertex j-1 is the minimum if A(i,j) and A(j,i).

`upper`

An undirected graph will be created, the number of edges between vertices is given by the upper triangle of the adjacency matrix, the lower triangle is ignored. (The diagonal is considered to be part of the upper triangle so loop edges might be created.)

`lower`

An undirected graph will be created, the number of edges between vertices is given by the lower triangle of the adjacency matrix, the upper triangle is ignored. (The diagonal is considered to be part of the lower triangle so loop edges might be created.)

`plus`

An undirected graph will be created, the number of edges from vertex i-1 to vertex j-1 is given by A(i,j)+A(j,i).

Here are some examples, we use the same adjacency matrix with
different * mode* parameter.
TODO

`graph.adjacency()`

also accepts logical matrices,
the TRUE value corresponds to 1 and FALSE is interpreted as 0. This
can be used to create graph based on a *threshold*
on the A(i,j) values:

`>`

M <- matrix( runif(100), nr=10 )`>`

graph.adjacency( M >= 0.9 )

Vertices: 10 Edges: 13 Directed: TRUE Edges: [0] 0 -> 7 [1] 1 -> 1 [2] 1 -> 4 [3] 1 -> 5 [4] 3 -> 0 [5] 4 -> 0 [6] 4 -> 2 [7] 4 -> 7 [8] 6 -> 8 [9] 7 -> 4 [10] 7 -> 5 [11] 7 -> 7 [12] 9 -> 7

The “inverse” operation of
`graph.adjacency()`

is
`get.adjacency()`

, this takes a graph and returns an
adjacency matrix. For undirected graphs it also takes an optional
argument called * type*. If its value is

`both`

then a symmetric adjacency matrix is
returned, for `upper`

only the upper triangle
of the adjacency matrix will contain the data (including the
diagonal), the lower triangle is filled with
zeros. `lower`

is the opposite: the lower triangle
(including the diagonal) contains data, the upper triangle is filled
with zeros.
This section show a detailed example on how to convert your data into an igraph graph.

Suppose you're just back from your field research or you've found some nice piece of data online and you want to turn your data into an igraph graph object to play with it. Although we cannot give a general receipt which works in all cases, the following example should provide you enough guidelines to handle almost any type of data you'll find.

If your data is not in an electronic format, then you need to type it in into one or more files first. This can be done by any basic editor you are familiar with, for example notepad or wordpad on Microsoft operating systems, emacs or vi on Unix-like systems. You can also use a speadsheet program if you like but make sure that it can export CSV (comma separated value) files.

You can also type in your data directly into R, in the first example we will show you how this can be done.

If you have a graph with not too many vertices and edges, you can type it in directly into R. (This is of course possible for larger graphs as well, but we do not recommend it as text editors are generally much more comfortable.)

First we will create a data frame containing a weighted symbolic edge list, and then we will transform the data frame into an igraph graph. In case you don't know what data frames are, they are tables and generally each row in the table corresponds to an individual (or something else if you have non-social networks) and each column is a variable, a trait or observation of the individuals.

A data frame can be edited with the `edit()`

function in R, we will use this to type in our data.

`> `

data <- edit(data.frame())

Note that the appearence of the edit window may be slightly different on different operating systems or GUIs. You can resize the edit window to have a third column for the weigths or simply use the tabulator key to move to the third column. Now fill the fields with your graph data. Click on the header to set the names of the columns, these will be the names of the edge attributes. Click on the 'Quit' button if you're ready. Here is an example:

Now the `dat`

variable contains a data frame with
your data. You can create an igraph graph object from these by calling
the `graph.data.frame()`

:

`>`

g <- graph.data.frame(dat, directed=FALSE)`>`

g

Vertices: 5 Edges: 10 Directed: FALSE Edges: [0] 0 -- 1 [1] 0 -- 2 [2] 0 -- 3 [3] 0 -- 4 [4] 1 -- 2 [5] 1 -- 3 [6] 1 -- 4 [7] 2 -- 3 [8] 2 -- 4 [9] 3 -- 4

`> `

V(g)$name

[1] "Alice" "Bob" "Cecil" "Denis" "Eszter"

`> `

E(g)$weight

[1] 1 -1 1 1 -1 1 1 1 1 -1

Let us suppose that you've colledted information about the social network of a small group, say ten people. You want to store various attributes of the people and different types of relations (like friendship, business relationship, etc.) among them. You decide to store everything in a single graph, for simplicity.

The following traits are collected about the people in the experiment: name, age, gender and R will assign numeric ids to them. Here is the complete data set:

Name | Age | Gender |
---|---|---|

Alice Anderson | 48 | F |

Bob Bradford | 33 | M |

Cecil Connor | 45 | F |

David Daugher | 34 | M |

Esmeralda Escobar | 21 | F |

Frank Finley | 36 | M |

Gabi Garbo | 44 | F |

Helen Hunt | 40 | F |

Iris Irving | 25 | F |

James Jones | 47 | M |

The next table contains all the different relations among these people, some relations are binary (YES or NO, eg. the one indicating whether or not two people work in the same room), others like friendship might have values between zero and five.

Name 1 | Name 2 | Same department | Friendship | Advice |
---|---|---|---|---|

Bob | Alice | N | 4 | 4 |

Cecil | Bob | N | 5 | 5 |

Cecil | Alice | Y | 5 | 5 |

David | Alice | N | 2 | 4 |

David | Bob | N | 1 | 2 |

Esmeralda | Alice | Y | 1 | 3 |

... | ... | ... | ... | ... |

Note that we will have a directed graph because the relations are not symmetric. Ie. according to line 1 Alice asks Bob very rarely for advice, but (line 2) Bob seeks Alice's advice quite often.

Let us assume that this data is save to a file in CSV format, this
simply means that the entries in the table are separated by commas.
The first file (`traits.csv`

) contains the traits
and looks like this:

Alice Anderson, 48, F Bob Bradford, 33, M Cecil Connor, 45, F David Daugher, 34, M Esmeralda Escobar, 21, F Frank Finley, 36, M Gabi Garbo, 44, F Helen Hunt, 40, F Iris Irving, 25, F James Jones, 47, M

and the second file which contains the relations looks like this:

Bob, Alice, N, 4, 4 Cecil, Bob, N, 5, 5 Cecil, Alice, Y, 5, 5 David, Alice, N, 2, 4 David, Bob, N, 1, 2 Esmeralda, Alice, Y, 1, 3 ...

Click here for the full files: `traits.csv`

and `relations.csv`

.

Let us now import this data into R and create an igraph graph from it.
First we will import the data into data frames. (If you don't know
what data frames are, don't worry you'll still survice this section,
but it might worth to consult the R tutorial. For now the important
thing is that a data frame is like a table.) The command
`read.csv()`

can read a CSV file and stores the
data in a data frame:

`>`

traits <- read.csv("traits.csv", head=FALSE)`>`

rel <- read.csv("relations.csv", head=FALSE)

The * head* parameter is set to

`FALSE`

, this instructs R not to interpret the
first line of the file as a header. Set this to
`TRUE`

(or omit the parameter) if you have a header
with column names in your file.
We will first create an empty graph with ten vertices and the traits as vertex attributes. This can be done with a single command after loading the package:

`>`

library(igraph)`>`

g <- graph.empty()`>`

g <- add.vertices(g, nrow(traits), name=as.character(traits[,1]), age=traits[,2], gender=as.character(traits[,3]))

We can check that the graph is created properly:

`> `

V(g)$name

[1] "Alice Anderson" "Bob Bradford" "Cecil Connor" [4] "David Daugher" "Esmeralda Escobar" "Frank Finley" [7] "Gabi Garbo" "Helen Hunt" "Iris Irving" [10] "James Jones"

`> `

V(g)$age

[1] 48 33 45 34 21 36 44 40 25 47

`> `

V(g)$gender

[1] "F" "M" "F" "M" "F" "M" "F" "F" "F" "M"

Now we continue by adding the edges. First we need to cut the first names from the full names, since the edges are given with the first names only. If you want to spare this step use the same names in both files.

`>`

names <- sapply(strsplit(V(g)$name, " "), "[",1)`>`

ids <- 1:length(names)-1`>`

names(ids) <- names`>`

ids

Alice Bob Cecil David Esmeralda Frank Gabi Helen 0 1 2 3 4 5 6 7 Iris James 8 9

Then, we create an edge list with vertex ids instead of symbolic names:

`>`

from <- as.character(rel[,1])`>`

to <- as.character(rel[,2])`>`

edges <- matrix(c(ids[from], ids[to]), nc=2)`>`

edges

[,1] [,2] [1,] 1 0 [2,] 2 1 [3,] 2 0 [4,] 3 0 [5,] 3 1 [6,] 4 0 [7,] 5 0 [8,] 5 4 [9,] 6 1 [10,] 6 0 [11,] 7 0 [12,] 8 2 [13,] 8 1 [14,] 8 4 [15,] 9 0 [16,] 9 1 [17,] 9 6 [18,] 0 1 [19,] 1 2 [20,] 0 2 [21,] 0 3 [22,] 1 3 [23,] 0 4 [24,] 0 5 [25,] 4 5 [26,] 1 6 [27,] 0 6 [28,] 0 7 [29,] 2 8 [30,] 1 8 [31,] 4 8 [32,] 0 9 [33,] 1 9 [34,] 6 9

And finally we can add the edges together with their attributes:

`>`

g <- add.edges(g, t(edges), room=as.character(rel[,3]), friend=rel[,4], advice=rel[,5])`>`

g

Vertices: 10 Edges: 34 Directed: TRUE Edges: [0] 1 -> 0 [1] 2 -> 1 [2] 2 -> 0 [3] 3 -> 0 [4] 3 -> 1 [5] 4 -> 0 [6] 5 -> 0 [7] 5 -> 4 [8] 6 -> 1 [9] 6 -> 0 [10] 7 -> 0 [11] 8 -> 2 [12] 8 -> 1 [13] 8 -> 4 [14] 9 -> 0 [15] 9 -> 1 [16] 9 -> 6 [17] 0 -> 1 [18] 1 -> 2 [19] 0 -> 2 [20] 0 -> 3 [21] 1 -> 3 [22] 0 -> 4 [23] 0 -> 5 [24] 4 -> 5 [25] 1 -> 6 [26] 0 -> 6 [27] 0 -> 7 [28] 2 -> 8 [29] 1 -> 8 [30] 4 -> 8 [31] 0 -> 9 [32] 1 -> 9 [33] 6 -> 9

You can find the whole program creating the graph in file
`import.R`

.

Finally we can create a nice plot of the graph, with the edges connecting people working in the same room colored red.

`>`

E(g)$color <- "black"`>`

E(g)[ room=="y" ]$color <- "red"`>`

tkplot(g, layout=layout.kamada.kawai, edge.color=E(g)$color)

<< Regular Graphs |
Random Graphs >> |