run
run(name='Cora', data_root='/tmp', no_features=False, model='VGAE', num_epochs=10000, patience=20, runs=10, cl_runs=5, dims: List[int] | None = None, hidden_multiplier=2, target_patch_degree=4.0, min_overlap: int | None = None, target_overlap: int | None = None, gamma=0.0, sparsify='resistance', train_directed=False, cluster='metis', num_clusters=10, beta=0.1, num_iters: int | None = None, lr=0.001, cl_model='logistic', cl_train_args={}, cl_model_args={}, dist=False, output='.', device: str | None = None, verbose_train=False, verbose_l2g=False, levels=1, resparsify=0, run_baseline=True, normalise=False, restrict_lcc=False, scale=False, rotate=True, translate=True, mmap_edges=False, mmap_features=False, random_split=True, use_tmp=False, cluster_init=False, use_gpu_frac=1.0, grid_search_params=True, progress_bars=True)
Run training example.
By default this function writes results to the current working directory. To override this, use the output argument.

Parameters:

name – Name of data set to load (one of {'Cora', 'PubMed', 'AMZ_computers', 'AMZ_photo'})
data_root – Directory to use for downloaded data
no_features – If True, discard features and use node identity
model – Embedding model type (one of {'VGAE', 'GAE', 'DGI'})
num_epochs – Number of training epochs
patience – Patience for early stopping
runs – Number of training runs (keep best result)
dims – List of embedding dimensions (default: [2])
hidden_multiplier – Hidden dimension is hidden_multiplier * dim
target_patch_degree – Target patch degree for resistance sparsification
min_overlap – Minimum target patch overlap (default: max(dims) + 1)
target_overlap – Target patch overlap (default: 2 * max(dims))
gamma – Value of 'gamma' for RMST sparsification
sparsify – Sparsification method to use (one of {'resistance', 'none', 'rmst'})
train_directed – Use the original directed network (only relevant for some loaders) (default: False)
cluster – Clustering method to use (one of {'louvain', 'fennel', 'distributed', 'metis'})
num_clusters – Target number of clusters for distributed, fennel, or metis
beta – Parameter for the distributed clustering algorithm
num_iters – Maximum iterations for distributed or fennel
lr – Learning rate
cl_model – Classification model to use (one of 'logistic' or 'mlp') (default: 'logistic')
cl_train_args – Extra arguments to pass down to the classification training (default: {})
cl_model_args – Extra arguments to pass to the classification model constructor (default: {})
dist – If True, use the distance decoder instead of the inner-product decoder
verbose_l2g – Verbose output for the alignment step (default: False)
normalise – If True, normalise the dataset features (default: False)
restrict_lcc – If True, restrict the dataset to its largest connected component (default: False)
output – Output folder
levels – Number of hierarchical patch levels (default: 1)
resparsify – If > 0, use resistance sparsification for all levels of the hierarchy (default: 0)
scale – Apply scaling transformations during alignment (default: False)
rotate – Apply rotations during alignment (default: True)
translate – Apply translations during alignment (default: True)
mmap_edges – Use memory mapping for edges (only supported by some loaders) (default: False)
mmap_features – Use memory mapping for features (only supported by some loaders) (default: False)
random_split – Use random train-test splits for evaluation (default: True)
use_tmp – Copy data to a temporary directory during load (default: False)
cluster_init – Run the cluster initialisation script (default: False)
use_gpu_frac – Fraction of the GPU to use by each worker (default: 1.0)
grid_search_params – Use grid search for classification parameters (only for cl_model='mlp')
device – Device used for training, e.g. 'cpu' or 'cuda' (defaults to 'cuda' if available, else 'cpu')
verbose_train – If True, show progress info during training
run_baseline – If True, run the baseline full model
progress_bars – Show progress bars (default: True)
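The following is a minimal sketch of calling run() from Python with a handful of the parameters documented above. The import path (local2global_embedding.run) is an assumption; adjust it to wherever the function is defined in your installation.

# Minimal sketch; the import path below is an assumption.
from local2global_embedding.run import run

run(
    name='Cora',            # data set to load
    data_root='/tmp',       # directory for downloaded data
    model='VGAE',           # one of 'VGAE', 'GAE', 'DGI'
    dims=[2, 4, 8],         # embedding dimensions to train
    sparsify='resistance',  # patch-graph sparsification method
    cluster='metis',        # clustering method used to create patches
    num_clusters=10,        # target number of patches
    output='results',       # write results here instead of the current directory
    device=None,            # defaults to 'cuda' if available, else 'cpu'
)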
This function is also exposed as a command-line interface.
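As a rough sketch, a command-line invocation mirroring the call above might look like the following. The exact entry point and flag spelling are assumptions (they depend on how the package registers the CLI); the options are expected to mirror the keyword arguments documented above.

# entry point and flag names are assumptions
python -m local2global_embedding.run --name Cora --data_root /tmp --model VGAE --cluster metis --num_clusters 10 --output results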
References