Emgraph (Embedding graphs) is a Python library for graph representation learning.
It provides a simple API for designing, training, and evaluating graph embedding models. You can use the base models to easily develop your own model.
Embedding the wordnet11 graph using the TransE model:
from sklearn.metrics import brier_score_loss, log_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import TransE


def train_transe(data):
    model = TransE(batches_count=64, seed=0, epochs=20, k=100, eta=20,
                   optimizer='adam', optimizer_params={'lr': 0.0001},
                   loss='pairwise', verbose=True, large_graphs=False)
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_transe(data=wn11_dataset)
    print("Scores: ", scores)
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
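The last line of the example passes the raw model scores through the logistic sigmoid (`expit`) and compares the resulting probabilities with the binary test labels. The sketch below reproduces that calculation with the standard library only, so it runs without Emgraph installed. The scores and labels are made-up illustrative values, not Emgraph output, and the helper names `sigmoid` and `brier_score` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, the function that scipy.special.expit computes."""
    return 1.0 / (1.0 + math.exp(-x))


def brier_score(labels, probabilities):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


# Made-up raw scores and binary labels, for illustration only.
raw_scores = [2.0, -1.0, 0.0, 3.0]
labels = [1, 0, 1, 1]

probabilities = [sigmoid(s) for s in raw_scores]
loss = brier_score(labels, probabilities)
print("Brier score loss:", round(loss, 4))
```

A lower Brier score is better: 0 means every predicted probability matches its label exactly, and 1 is the worst possible value.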
Evaluating the ComplEx model after training:
import numpy as np
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import ComplEx
from emgraph.evaluation import evaluate_performance


def complex_performance(data):
    model = ComplEx(batches_count=10, seed=0, epochs=20, k=150, eta=1,
                    loss='nll', optimizer='adam')
    model.fit(np.concatenate((data['train'], data['valid'])))
    filter_triples = np.concatenate((data['train'], data['valid'], data['test']))
    ranks = evaluate_performance(data['test'][:5], model=model,
                                 filter_triples=filter_triples,
                                 corrupt_side='s+o',
                                 use_default_protocol=False)
    return ranks


if __name__ == '__main__':
    wn18_dataset = BaseDataset.load_dataset(DatasetType.WN18)
    ranks = complex_performance(data=wn18_dataset)
    print("ranks {}".format(ranks))
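Ranks like the ones printed above are usually summarized as mean reciprocal rank (MRR) and Hits@N. The following is a minimal sketch of both metrics in plain Python. It assumes the ranks form a flat sequence of positive integers where 1 is the best rank; the helper names `mrr` and `hits_at_n` and the example ranks are hypothetical and are not taken from the Emgraph API.

```python
def mrr(ranks):
    """Mean reciprocal rank: the average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_n(ranks, n):
    """Fraction of test triples whose rank is n or better."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Made-up ranks for five test triples, for illustration only.
example_ranks = [1, 3, 2, 10, 50]

print("MRR:", round(mrr(example_ranks), 4))
print("Hits@3:", hits_at_n(example_ranks, 3))
print("Hits@10:", hits_at_n(example_ranks, 10))
```

Both metrics lie between 0 and 1, and higher is better. MRR rewards ranks close to 1 more strongly, while Hits@N only checks whether the true triple landed within the top N.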
Embedding the wordnet11 graph using the DistMult model:
from sklearn.metrics import brier_score_loss, log_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import DistMult


def train_dist_mult(data):
    model = DistMult(batches_count=1, seed=555, epochs=20, k=10, loss='pairwise',
                     loss_params={'margin': 5})
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_dist_mult(data=wn11_dataset)
    print("Scores: ", scores)
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
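The DistMult example trains with `loss='pairwise'` and `loss_params={'margin': 5}`. As a rough illustration of what a margin setting does, here is a sketch of a generic pairwise margin-based ranking loss. It assumes that higher scores mean more plausible triples and that each positive triple is paired with one corrupted (negative) triple. It is not Emgraph's implementation, and the function name `pairwise_margin_loss` and the scores are hypothetical.

```python
def pairwise_margin_loss(pos_scores, neg_scores, margin=5.0):
    """Sum of max(0, margin + negative_score - positive_score) over all pairs."""
    return sum(max(0.0, margin + n - p) for p, n in zip(pos_scores, neg_scores))


# Made-up scores for two positive triples and their corruptions.
pos = [8.0, 2.0]
neg = [1.0, 0.5]

# Pair 1: 5 + 1.0 - 8.0 = -2.0, clipped to 0 (the margin is already satisfied).
# Pair 2: 5 + 0.5 - 2.0 = 3.5 (the positive does not beat the negative by 5).
print("Pairwise loss:", pairwise_margin_loss(pos, neg, margin=5.0))
```

Under this formulation, a pair contributes nothing once the positive triple outscores its corruption by at least the margin, so a larger margin pushes the model to separate the two more widely.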
The Emgraph project welcomes your expertise and enthusiasm!
Ways to contribute to Emgraph:
- Write code
- Review pull requests
- Develop tutorials, presentations, and other educational materials
- Translate documentation and readme contents
If you encounter any issues in the code, please report them. Better still, fork the repository on GitHub and/or create a pull request.
- Support CPU/GPU
- Vectorized operations
- Preprocessors
- Dataset loader
- Standard API
- Documentation
- Test driven development
This repository is a port of the AmpliGraph library to TensorFlow 2, reimplemented with a modular architecture. It also draws inspiration from PyKEEN and Spectral. Credit is extended to these exceptional projects.
Copyright © 2019-2024 Emgraph Developers: Soran Ghaderi and Taleb Zarhesh