Class RepetitionAnalysisStep

  • All Implemented Interfaces:
    AutoCloseable, Function<Stream<Event>,Stream<Event>>, IPipelineStep

    public class RepetitionAnalysisStep
    extends BasePipelineStep
    The step analyzes repetitions in input documents. Either exact or configurable fuzzy search is performed.

    Two types of annotations are created for repetitive segments: RepetitiveSegmentAnnotation and AltTranslationsAnnotation. A RepetitiveSegmentAnnotation is attached to every repetitive source segment. An AltTranslationsAnnotation is attached to each target segment that corresponds to a repetitive source segment, except for the first segment of each repetition group. That first segment is skipped so that counting steps do not count it twice, as repetitive with itself.
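
The annotation rule described above can be illustrated with a minimal, self-contained sketch. It does not use the Okapi API: the class `RepetitionSketch`, the record-like type `Marks`, and the method `analyze` are hypothetical names introduced only for illustration, and the two boolean flags merely stand in for the presence of a RepetitiveSegmentAnnotation and an AltTranslationsAnnotation. The sketch assumes exact matching on plain source strings and a two-pass scan; the actual step may work differently (for example, with fuzzy matching or a different traversal).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not part of the Okapi API. */
final class RepetitionSketch {

    /** Stand-in for the annotations a segment would receive. */
    static final class Marks {
        final boolean repetitive;      // would carry a RepetitiveSegmentAnnotation
        final boolean altTranslation;  // target would carry an AltTranslationsAnnotation

        Marks(boolean repetitive, boolean altTranslation) {
            this.repetitive = repetitive;
            this.altTranslation = altTranslation;
        }
    }

    /** Assumption: exact string equality decides whether segments repeat. */
    static List<Marks> analyze(List<String> sources) {
        // Pass 1: count how often each source text occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : sources) {
            counts.merge(s, 1, Integer::sum);
        }

        // Pass 2: mark every member of a repetition group as repetitive,
        // but flag the alternate translation only from the second occurrence on.
        Set<String> seen = new HashSet<>();
        List<Marks> result = new ArrayList<>();
        for (String s : sources) {
            boolean repetitive = counts.get(s) > 1;
            boolean firstOccurrence = seen.add(s); // true when s was not seen before
            result.add(new Marks(repetitive, repetitive && !firstOccurrence));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sources = List.of("Open the file.", "Save.", "Open the file.");
        List<Marks> marks = analyze(sources);
        for (int i = 0; i < sources.size(); i++) {
            Marks m = marks.get(i);
            System.out.println(sources.get(i) + " repetitive=" + m.repetitive
                    + " altTranslation=" + m.altTranslation);
        }
    }
}
```

In this sketch, both occurrences of a repeated sentence are flagged as repetitive, while only the later occurrence is flagged for an alternate translation, which mirrors the reason given above: a counting step that relies on the second flag does not count the first occurrence as a repetition of itself.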