Transformations
This module offers a canonical interface that aims to make transformation algorithms easier to reuse.
Let’s look at it with examples:
from delb.transform import Transformation


class ResolveCopyOf(Transformation):
    def transform(self):
        for node in self.root.css_select(
            "*[copyOf]", namespaces={None: "http://www.tei-c.org/ns/1.0"}
        ):
            source_id = node["copyOf"]
            source_node = self.origin_document.xpath(
                f'//*[@xml:id="{source_id[1:]}"]',
                namespaces={}
            ).first
            cloned_node = source_node.clone(deep=True)
            cloned_node.id = None
            node.replace_with(cloned_node)
Instances of transformations defined this way can be called with a (sub-)tree and, optionally, the document that the tree originates from:
resolve_copy_of = ResolveCopyOf()
tree = resolve_copy_of(tree, origin_document=document)
Options for a transformation are defined as a typing.NamedTuple:
from typing import Final, NamedTuple, TypedDict

# The TEI namespace URI, as used in the ResolveCopyOf example.
TEI_NAMESPACE: Final = "http://www.tei-c.org/ns/1.0"


class NamespacesKWArgs(TypedDict):
    namespaces: dict[str | None, str]


TEI: Final[NamespacesKWArgs] = {"namespaces": {None: TEI_NAMESPACE}}


class ResolveChoiceOptions(NamedTuple):
    corr: bool = True
    reg: bool = True


class ResolveChoice(Transformation):
    options_class = ResolveChoiceOptions

    # options defaults to None so that ResolveChoice() uses the default options
    def __init__(self, options=None):
        super().__init__(options)
        self.keep_selector = ",".join(
            (
                "corr" if self.options.corr else "sic",
                "reg" if self.options.reg else "orig"
            )
        )
        self.drop_selector = ",".join(
            (
                "sic" if self.options.corr else "corr",
                "orig" if self.options.reg else "reg"
            )
        )

    def transform(self):
        for choice_node in self.root.css_select("choice", **TEI):
            node_to_drop = choice_node.css_select(self.drop_selector, **TEI).first
            node_to_drop.detach()
            node_to_keep = choice_node.css_select(self.keep_selector, **TEI).first
            node_to_keep.detach(retain_child_nodes=True)
            choice_node.detach(retain_child_nodes=True)
A transformation class that defines an options_class attribute can then be used either with its defaults or with alternate options:
resolve_choice = ResolveChoice()
tree = resolve_choice(tree)
resolve_choice = ResolveChoice(ResolveChoiceOptions(reg=False))
tree = resolve_choice(tree)
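To see what the two configurations above select, the selector logic from ResolveChoice can be run on its own. The following is a minimal sketch that needs no delb installation; the `selectors` helper is a hypothetical name introduced only for this illustration.

```python
from typing import NamedTuple


class ResolveChoiceOptions(NamedTuple):
    corr: bool = True
    reg: bool = True


def selectors(options: ResolveChoiceOptions) -> tuple[str, str]:
    """Return the (keep, drop) CSS selectors derived from the options.

    Hypothetical helper that mirrors the expressions in ResolveChoice.__init__.
    """
    keep = ",".join(
        ("corr" if options.corr else "sic", "reg" if options.reg else "orig")
    )
    drop = ",".join(
        ("sic" if options.corr else "corr", "orig" if options.reg else "reg")
    )
    return keep, drop


# Default options keep corrections and regularizations.
defaults = selectors(ResolveChoiceOptions())
# With reg=False the original spelling is kept instead of the regularized one.
alternate = selectors(ResolveChoiceOptions(reg=False))
```

With the defaults, the corr and reg elements are kept and sic and orig are dropped; with reg=False, orig is kept and reg is dropped.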
Finally, concrete transformations can be chained, either as classes or as instances. Sequences can themselves be chained as well:
from delb.transform import TransformationSequence
tidy_up = TransformationSequence(ResolveCopyOf, resolve_choice)
tree = tidy_up(tree)
Attention
This is an experimental feature. It might change significantly in the future or be removed altogether.
- class delb.transform.Transformation(options: NamedTuple | None = None)
This is a base class for any transformation algorithm.
- abstract transform()
This method needs to implement the transformation logic. When it is called, the instance has two attributes assigned from its call:
- root is the node that the transformation was called with.
- origin_document is the document that was possibly passed as second argument.
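The calling convention described above can be pictured with a small stand-in class. This is a sketch under assumptions, not delb's actual implementation: `SketchTransformation` and `UppercaseWords` are hypothetical names, a plain list of strings stands in for a node tree, and returning `self.root` from the call is assumed from the usage `tree = resolve_copy_of(tree, ...)`.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class SketchTransformation(ABC):
    """Hypothetical stand-in for a transformation base class."""

    options_class: Optional[type] = None

    def __init__(self, options: Any = None):
        # Fall back to the defaults of options_class when no options are given.
        if options is None and self.options_class is not None:
            options = self.options_class()
        self.options = options

    def __call__(self, root: Any, origin_document: Any = None) -> Any:
        # Both attributes are assigned per call, before transform() runs.
        self.root = root
        self.origin_document = origin_document
        self.transform()
        return self.root

    @abstractmethod
    def transform(self) -> None:
        """Implement the transformation logic in subclasses."""


class UppercaseWords(SketchTransformation):
    def transform(self) -> None:
        # A list of strings stands in for a node tree here.
        self.root = [word.upper() for word in self.root]


transformation = UppercaseWords()
result = transformation(["a", "b"], origin_document="some document")
```

Because `transform()` is abstract, only subclasses that implement it can be instantiated.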
- class delb.transform.TransformationSequence(*transformations: TransformationBase | type[TransformationBase])
A transformation sequence can be used to combine any number of both Transformation (provided as class or instantiated with options) and other TransformationSequence instances or classes.
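One way to picture how a sequence can accept classes, instances and nested sequences alike is the following minimal sketch. It is an illustration under assumptions rather than delb's implementation: `SketchSequence` and the step classes are hypothetical, and lists of strings stand in for trees.

```python
class Strip:
    """Hypothetical step: strips whitespace from every item."""

    def __call__(self, root):
        return [item.strip() for item in root]


class Upper:
    """Hypothetical step: upper-cases every item."""

    def __call__(self, root):
        return [item.upper() for item in root]


class DropEmpty:
    """Hypothetical step: removes empty items."""

    def __call__(self, root):
        return [item for item in root if item]


class SketchSequence:
    """Hypothetical stand-in for a sequence of transformations."""

    def __init__(self, *transformations):
        # Classes are instantiated with their defaults; instances are kept as is.
        self.steps = [t() if isinstance(t, type) else t for t in transformations]

    def __call__(self, root):
        # Each step receives the result of the previous one.
        for step in self.steps:
            root = step(root)
        return root


inner = SketchSequence(Strip, Upper())  # a class and an instance
outer = SketchSequence(inner, DropEmpty)  # a sequence nested in a sequence
result = outer([" a ", "  ", "b"])
```

Since a sequence is itself callable in the same way as a single step, nesting needs no special handling in this sketch.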