deepblocker
DeepBlocker
Bases: EmbeddingBlocker
Base class for DeepBlocker strategies.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
frame_encoder | HintOrType[DeepBlockerFrameEncoder] | DeepBlocker strategy to use, given as a DeepBlockerFrameEncoder or a hint that resolves to one (e.g. "autoencoder"). | None |
frame_encoder_kwargs | OptionalKwargs | Keyword arguments for initialising the frame encoder. | None |
embedding_block_builder_kwargs | OptionalKwargs | Keyword arguments for initialising the block builder. | None |
save | bool | If true, save the embeddings before block building. | True |
save_dir | Optional[Union[str, Path]] | Directory in which to save the embeddings. | None |
force | bool | If true, recalculate the embeddings and overwrite existing ones. Otherwise use precalculated embeddings if present. | False |
Attributes:
Name | Type | Description |
---|---|---|
frame_encoder | | DeepBlocker encoder class used to embed the datasets. |
embedding_block_builder | | Block building class that creates blocks from the embeddings. |
save | | If true, save the embeddings before block building. |
save_dir | | Directory in which to save the embeddings. |
force | | If true, recalculate the embeddings and overwrite existing ones. Otherwise use precalculated embeddings if present. |
Examples:
>>> # doctest: +SKIP
>>> from sylloge import MovieGraphBenchmark
>>> from klinker.data import KlinkerDataset
>>> ds = KlinkerDataset.from_sylloge(MovieGraphBenchmark(), clean=True)
>>> from klinker.blockers import DeepBlocker
>>> blocker = DeepBlocker(frame_encoder="autoencoder")
>>> blocks = blocker.assign(left=ds.left, right=ds.right)
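The `save`, `save_dir`, and `force` parameters describe a load-or-compute pattern for embeddings. The following is a minimal, self-contained sketch of that pattern, not klinker's actual implementation: `load_or_compute`, `fake_encoder`, and the file name `embeddings.pkl` are hypothetical names introduced for illustration, and the sketch assumes that nothing is cached when `save_dir` is `None`, which may differ from what the library does.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, List, Optional, Union

Embeddings = List[List[float]]


def load_or_compute(
    compute: Callable[[], Embeddings],
    save: bool = True,
    save_dir: Optional[Union[str, Path]] = None,
    force: bool = False,
    filename: str = "embeddings.pkl",  # hypothetical file name
) -> Embeddings:
    """Return embeddings, reusing a saved copy unless `force` is set."""
    # Assumption of this sketch: without a save_dir there is no cache.
    cache = Path(save_dir) / filename if save_dir is not None else None

    # Use precalculated embeddings if present and not forced to recalculate.
    if cache is not None and cache.exists() and not force:
        with cache.open("rb") as fh:
            return pickle.load(fh)

    embeddings = compute()

    # Save before any block building, overwriting an existing file.
    if save and cache is not None:
        cache.parent.mkdir(parents=True, exist_ok=True)
        with cache.open("wb") as fh:
            pickle.dump(embeddings, fh)
    return embeddings


calls = []  # records how often the (fake) encoder actually runs


def fake_encoder() -> Embeddings:
    """Stand-in for an expensive embedding step."""
    calls.append(1)
    return [[0.1, 0.2], [0.3, 0.4]]


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_compute(fake_encoder, save_dir=tmp)   # computed and saved
    second = load_or_compute(fake_encoder, save_dir=tmp)  # loaded from disk
    third = load_or_compute(fake_encoder, save_dir=tmp, force=True)  # recomputed
```

In this sketch the encoder runs twice: once for the initial call and once for the call with `force=True`, while the second call reads the saved file instead.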
Reference
Thirumuruganathan et al., 'Deep Learning for Blocking in Entity Matching: A Design Space Exploration', VLDB 2021, http://vldb.org/pvldb/vol14/p2459-thirumuruganathan.pdf
Source code in klinker/blockers/embedding/deepblocker.py