Repetition Analysis Step

From Okapi Framework
Revision as of 06:02, 13 May 2011 by Ysavourel

Overview

This step detects repeated segments in the input documents. It looks for either exact repetitions or fuzzy repetitions, using a configurable threshold.

Takes: Filter events. Sends: Filter events.

The step creates two types of annotations for the repetitive segments it finds: RepetitiveSegmentAnnotation and AltTranslationsAnnotation.

  • RepetitiveSegmentAnnotation is attached to a repetitive source segment.
  • AltTranslationsAnnotation is attached to the target segment that corresponds to a repetitive source segment. It is not attached to the first segment of a repetition group, so that counting steps do not count that segment as a repetition of itself.
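
The attachment rule above can be illustrated with a minimal sketch. This is not the Okapi API: the function name `annotate_repetitions` and the `repetitive` and `alt_translation` flags are hypothetical stand-ins for the two annotation classes, and only exact matching is shown.

```python
from collections import defaultdict


def annotate_repetitions(segments):
    """Flag repeated source segments (hypothetical helper, exact match only).

    `segments` is a list of source-segment strings in document order.
    `repetitive` stands in for RepetitiveSegmentAnnotation and
    `alt_translation` stands in for AltTranslationsAnnotation.
    """
    # Group segment positions by their source text.
    groups = defaultdict(list)
    for index, text in enumerate(segments):
        groups[text].append(index)

    result = [
        {"source": text, "repetitive": False, "alt_translation": False}
        for text in segments
    ]
    for indices in groups.values():
        if len(indices) < 2:
            continue  # a segment that occurs once is not a repetition
        for position, index in enumerate(indices):
            result[index]["repetitive"] = True
            # The first segment of a group gets no alternate translation,
            # so a counting step does not count it as repeating itself.
            result[index]["alt_translation"] = position > 0
    return result


annotated = annotate_repetitions(["Save", "Open file", "Save", "Save"])
repetition_count = sum(item["alt_translation"] for item in annotated)
```

In this sketch, three segments belong to the "Save" group, but only the second and third occurrences carry the alternate-translation flag, so a counter based on that flag reports two repetitions.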

Parameters

Fuzzy threshold: Similarity threshold used to detect fuzzy repetitions. Leave it at 100 to detect exact repetitions only.
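
The effect of the threshold can be sketched as follows. The similarity measure the step actually uses is not described on this page, so this example substitutes Python's `difflib.SequenceMatcher` as an assumed stand-in, and `is_repetition` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def is_repetition(first, second, threshold=100):
    """Return True if two segments count as repetitions at `threshold`.

    Assumption: similarity is a 0-100 score. The real step may compute
    it differently; difflib is used here only for illustration.
    """
    # autojunk=False keeps the score stable for long segments.
    score = SequenceMatcher(None, first, second, autojunk=False).ratio() * 100
    return score >= threshold


base = "Click the Save button"
exact_only = is_repetition(base, base + ".", threshold=100)  # near match is rejected
fuzzy = is_repetition(base, base + ".", threshold=95)        # near match is accepted
```

With the threshold at 100, only identical segments qualify. Lowering it lets near-identical segments, such as ones that differ by a trailing period, fall into the same repetition group.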

Limitations

None known.