Paragraph Alignment Step

From Okapi Framework

Overview

This step aligns the paragraphs (TextUnits) from two documents.

Takes: Filter events. Sends: TextUnit Filter events only.

The ResourceSimplifier is called internally to flatten events and expand TextUnits, so that all TextUnits are available for alignment and the skeleton is available for diffing if that option is enabled.

This step sends a special event (EventType.PIPELINE_PARAMETERS) that informs subsequent steps (e.g., Sentence Aligner) that the target input has been consumed and is no longer available.

Parameters

Output 1-1 matches only — Set this option to output only 1-1 paragraph aligned matches.

Use Skeleton Alignment? (Experimental) — Set this option to use skeleton alignment to provide better alignment anchor points.

Limitations

Standard Gale and Church alignment is not accurate for long runs of paragraphs. The experimental option "Use Skeleton Alignment?" may improve alignments for some formats.
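
To make the limitation more concrete, the following is a minimal sketch of length-based alignment in the style of Gale and Church. It is not Okapi's implementation: the class LengthAlignSketch and its methods are hypothetical names introduced for illustration, and the constants (c = 1, s² = 6.8, and the bead priors) are the values commonly quoted from the Gale and Church paper, assumed here rather than taken from the step's source code. The input is simply the character length of each paragraph. Because the cost depends only on lengths, the aligner has no anchor points, which is the gap the skeleton option is meant to address.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration class; not part of the Okapi API. */
final class LengthAlignSketch {
    // Bead shapes {source paragraphs, target paragraphs} and assumed prior weights.
    private static final int[][] SHAPES = {{1, 1}, {1, 0}, {0, 1}, {2, 1}, {1, 2}, {2, 2}};
    private static final double[] PRIORS = {0.89, 0.0099, 0.0099, 0.089, 0.089, 0.011};
    private static final double C = 1.0;  // assumed target characters per source character
    private static final double S2 = 6.8; // assumed variance per character

    /** Standard normal CDF for z >= 0 (Abramowitz-Stegun polynomial approximation). */
    static double phi(double z) {
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = ((((1.330274429 * t - 1.821255978) * t + 1.781477937) * t
                - 0.356563782) * t + 0.319381530) * t;
        return 1.0 - 0.3989423 * Math.exp(-z * z / 2.0) * poly;
    }

    /** Negative log-probability of pairing srcLen with tgtLen characters. */
    static double cost(int srcLen, int tgtLen, double prior) {
        double mean = (srcLen + tgtLen / C) / 2.0;
        double z = mean == 0 ? 0 : Math.abs((C * srcLen - tgtLen) / Math.sqrt(S2 * mean));
        double tail = Math.max(2.0 * (1.0 - phi(z)), 1e-300); // clamp to avoid log(0)
        return -Math.log(tail) - Math.log(prior);
    }

    /** Returns the cheapest sequence of beads, each as {sourceCount, targetCount}. */
    static List<int[]> align(int[] src, int[] tgt) {
        int n = src.length, m = tgt.length;
        double[][] best = new double[n + 1][m + 1];
        int[][] back = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 && j == 0) continue;
                best[i][j] = Double.POSITIVE_INFINITY;
                for (int k = 0; k < SHAPES.length; k++) {
                    int di = SHAPES[k][0], dj = SHAPES[k][1];
                    if (i < di || j < dj) continue;
                    int sLen = 0, tLen = 0;
                    for (int a = i - di; a < i; a++) sLen += src[a];
                    for (int b = j - dj; b < j; b++) tLen += tgt[b];
                    double total = best[i - di][j - dj] + cost(sLen, tLen, PRIORS[k]);
                    if (total < best[i][j]) {
                        best[i][j] = total;
                        back[i][j] = k;
                    }
                }
            }
        }
        // Walk the back-pointers from the end to recover the bead sequence.
        List<int[]> beads = new ArrayList<>();
        for (int i = n, j = m; i > 0 || j > 0; ) {
            int[] shape = SHAPES[back[i][j]];
            beads.add(shape);
            i -= shape[0];
            j -= shape[1];
        }
        Collections.reverse(beads);
        return beads;
    }

    /** Keeps only 1-1 beads, returned as {sourceIndex, targetIndex} pairs. */
    static List<int[]> oneToOneOnly(List<int[]> beads) {
        List<int[]> pairs = new ArrayList<>();
        int i = 0, j = 0;
        for (int[] b : beads) {
            if (b[0] == 1 && b[1] == 1) pairs.add(new int[] {i, j});
            i += b[0];
            j += b[1];
        }
        return pairs;
    }
}
```

For example, LengthAlignSketch.align(new int[] {100, 100}, new int[] {200}) pairs both source paragraphs with the single target paragraph as one 2-1 bead, and oneToOneOnly then returns an empty list for that result, which mirrors what the "Output 1-1 matches only" option would be expected to discard.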