# Semi-supervised Learning with Graph Convolutional Networks (GCN)

Graph convolutional networks (GCN) are often considered the first step into graph neural networks (GNN). This example walks through training a vanilla GCN.

## Semi-supervised Learning in Graph Neural Networks

In the semi-supervised setting, features are available for every node in a graph, but labels are given for only a subset of the nodes. We train the model on the labeled subset and evaluate it on a disjoint subset of nodes.
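
For intuition, here is a minimal sketch of such a split. The numbers follow the standard Planetoid split for Cora, which is used later in this example:

```julia
num_nodes = 2708      # nodes in Cora; features exist for all of them
train_idx = 1:140     # the small labeled subset used for training
test_idx = 1709:2708  # held-out labeled nodes used for testing
```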

## Node Classification Task

Here we tackle node classification: learning a model that predicts a label for every node in a graph. The GCN takes node features as input and outputs node labels.

## Step 1: Load Dataset

GeometricFlux provides the Planetoid dataset in `GeometricFlux.Datasets`, backed by GraphMLDatasets. Planetoid contains three sub-datasets: Cora, Citeseer, and PubMed. We demonstrate with the Cora dataset in this example. `traindata` loads training data from various kinds of datasets: the first argument specifies the dataset and the second the sub-dataset.

```julia
using GeometricFlux.Datasets

train_X, train_y = traindata(Planetoid(), :cora)
```

`traindata` returns pre-defined training features and labels; these features are node features. Here we convert them into dense matrices:

```julia
train_X, train_y = map(x->Matrix(x), traindata(Planetoid(), :cora))
```

We can load the graph with `graphdata`; it is preprocessed into a `SimpleGraph`, a type provided by Graphs.

```julia
g = graphdata(Planetoid(), :cora)
train_idx = train_indices(Planetoid(), :cora)
```

We need node indices to take a subgraph from the original graph; `train_indices` gives the node indices for training.
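
As a quick sanity check, we can inspect the loaded graph and indices (the exact numbers come from the Planetoid split for Cora):

```julia
using Graphs

nv(g)              # number of nodes in the graph, 2708 for Cora
length(train_idx)  # number of labeled training nodes, 140 for Cora
```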

## Step 2: Wrapping Graph and Features into `FeaturedGraph`

`FeaturedGraph` is a container that holds a graph together with node features, edge features, and global features. It is provided by GraphSignals. To wrap a graph and node features into a `FeaturedGraph`, pass the graph `g` as the first argument and specify the node features with the `nf` keyword.

```julia
using GraphSignals

FeaturedGraph(g, nf=train_X)
```

To take a subgraph from a `FeaturedGraph` object, call `subgraph` with the node indices `train_idx` as the second argument.

```julia
subgraph(FeaturedGraph(g, nf=train_X), train_idx)
```
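
The stored features can be read back with the `node_feature` accessor from GraphSignals; this is the same function that appears later as the final layer of the model:

```julia
fg = subgraph(FeaturedGraph(g, nf=train_X), train_idx)
node_feature(fg)  # the node feature matrix held by fg
```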

## Step 3: Build a GCN Model

The GCN model is composed of two `GCNConv` layers, with `relu` as the activation function for the first one. A `Dropout` layer sits between them; since `Dropout` is a regular Flux layer, we wrap it in `GraphParallel` and specify that node features go through `node_layer=Dropout(0.5)`.

```julia
using Flux
using GeometricFlux

# input_dim, hidden_dim and target_dim are hyperparameters,
# e.g. 1433, 16 and 7 respectively for Cora.
model = Chain(
    GCNConv(input_dim=>hidden_dim, relu),
    GraphParallel(node_layer=Dropout(0.5)),
    GCNConv(hidden_dim=>target_dim),
    node_feature,
)
```

Since the model input is a `FeaturedGraph` object, each layer outputs a `FeaturedGraph` as well. At the end of the model, we extract the node features from the final `FeaturedGraph` using `node_feature`.
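
A single forward pass then looks like this sketch: wrap the graph and features, apply the model, and receive a matrix of per-node class scores.

```julia
fg = FeaturedGraph(g, nf=train_X)
ŷ = model(fg)  # a target_dim × num_nodes matrix of logits
```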

## Step 4: Loss Functions and Accuracy

Since this is a node classification task, we define the model loss with `logitcrossentropy`, together with an L2 regularization term. As in the vanilla GCN, only the first layer is L2-regularized, weighted by the hyperparameter `λ`.

```julia
using Flux
using Flux.Losses: logitcrossentropy

l2norm(x) = sum(abs2, x)

function model_loss(model, λ, batch)
    loss = 0.0f0
    for (x, y) in batch
        loss += logitcrossentropy(model(x), y)
        # L2 regularization over the parameters of the first layer only
        loss += λ*sum(l2norm, Flux.params(model[1]))
    end
    return loss
end
```

Accuracy functions are provided for a single batch and for a data loader.

```julia
using Statistics: mean
using Flux: onecold, softmax, cpu
using Flux.Data: DataLoader

function accuracy(model, batch::AbstractVector)
    return mean(mean(onecold(softmax(cpu(model(x)))) .== onecold(cpu(y))) for (x, y) in batch)
end

accuracy(model, loader::DataLoader, device) = mean(accuracy(model, batch |> device) for batch in loader)
```
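
With these in place, evaluating the model on the test set is a one-liner (assuming `test_loader` and `device` are defined as in the training step below):

```julia
test_acc = accuracy(model, test_loader, device)
@info "Test accuracy: $(test_acc)"
```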

## Step 5: Training the GCN Model

We train the model with the same process as training a regular Flux model.

```julia
train_loader, test_loader = load_data(:cora, args.batch_size)

# optimizer
opt = ADAM(args.η)

# parameters
ps = Flux.params(model)

# training
train_steps = 0
@info "Start Training, total $(args.epochs) epochs"
for epoch = 1:args.epochs
    @info "Epoch $(epoch)"

    for batch in train_loader
        grad = gradient(() -> model_loss(model, args.λ, batch |> device), ps)
        Flux.Optimise.update!(opt, ps, grad)
        train_steps += 1
    end
end
```
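
The snippet above references a `load_data` helper and an `args` hyperparameter container that are defined in the full example script. Below is a rough, hypothetical sketch of what they could look like, assembled from the pieces in Steps 1 and 2. The `testdata` and `test_indices` accessors are assumed to mirror the training accessors shown earlier, `device` is `cpu` or `gpu` from Flux, and the hyperparameter values follow common vanilla-GCN defaults:

```julia
using Flux
using Flux.Data: DataLoader
using GeometricFlux.Datasets
using GraphSignals

# Hypothetical hyperparameters matching the usage above.
Base.@kwdef mutable struct Args
    η = 0.01       # learning rate
    batch_size = 1 # one (FeaturedGraph, labels) pair per batch
    epochs = 200   # number of training epochs
    λ = 5f-4       # L2 regularization weight
end
args = Args()
device = cpu  # swap in `gpu` to train on a GPU

# A sketch of `load_data`: build train/test subgraphs and wrap each
# (FeaturedGraph, labels) pair into a DataLoader.
function load_data(dataset, batch_size)
    g = graphdata(Planetoid(), dataset)
    train_X, train_y = map(Matrix, traindata(Planetoid(), dataset))
    test_X, test_y = map(Matrix, testdata(Planetoid(), dataset))
    train_idx = train_indices(Planetoid(), dataset)
    test_idx = test_indices(Planetoid(), dataset)
    train_data = [(subgraph(FeaturedGraph(g, nf=train_X), train_idx), train_y[:, train_idx])]
    test_data = [(subgraph(FeaturedGraph(g, nf=test_X), test_idx), test_y[:, test_idx])]
    train_loader = DataLoader(train_data, batchsize=batch_size)
    test_loader = DataLoader(test_data, batchsize=batch_size)
    return train_loader, test_loader
end
```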

So far, we have completed a basic tutorial for training a GCN model!

For the complete example, please check the script `examples/semisupervised_gcn.jl`.

## Acceleration by Pre-computing the Normalized Adjacency Matrix

The training process can be slow in this example. Since we place the graph and the features together in a `FeaturedGraph` object, `GCNConv` has to compute a normalized adjacency matrix during training, which leads to long training times. We can accelerate training by pre-computing the normalized adjacency matrix for all `FeaturedGraph` objects. To do so, call the following function on `fg` before training:

```julia
GraphSignals.normalized_adjacency_matrix!(fg)
```

Since `GCNConv` consumes the normalized adjacency matrix, pre-computing it for `fg` saves the repeated computation. Note that if a layer doesn't require a normalized adjacency matrix, this step will lead to an error.