Class TextContainer
- java.lang.Object
-
- net.sf.okapi.common.resource.BaseNameable
-
- net.sf.okapi.common.resource.TextContainer
-
- All Implemented Interfaces:
Cloneable, Iterable<TextPart>, IResource, INameable, IWithAnnotations, IWithProperties, IWithSkeleton
public class TextContainer extends BaseNameable implements Iterable<TextPart>
Provides methods for storing the content of a paragraph-type unit and for handling its properties, annotations and segmentation.
The TextContainer is made of a collection of parts: some are simple TextPart objects, others are special TextPart objects called Segment. A TextContainer always has at least one Segment part.
-
-
Field Summary
-
Fields inherited from class net.sf.okapi.common.resource.BaseNameable
id, isTranslatable, mimeType, name, preserveWS, type
-
Fields inherited from interface net.sf.okapi.common.IResource
COPY_ALL, COPY_CONTENT, COPY_PROPERTIES, COPY_SEGMENTATION, COPY_SEGMENTED_CONTENT, CREATE_EMPTY
-
-
Constructor Summary
Constructors
TextContainer()
Creates a new empty TextContainer object.
TextContainer(String text)
Creates a new TextContainer object with some initial text.
TextContainer(Segment segment)
Creates a new TextContainer object with an initial segment.
TextContainer(TextFragment fragment)
Creates a new TextContainer object with an initial TextFragment.
TextContainer(TextPart... parts)
Creates a new TextContainer object with initial TextParts (segment or non-segment) appended.
-
Method Summary
void append(String text)
Appends a part with a given text at the end of this container.
void append(String text, boolean collapseIfPreviousEmpty)
Appends a part with a given text at the end of this container.
void append(TextFragment fragment)
Appends a part at the end of this container.
void append(TextFragment fragment, boolean collapseIfPreviousEmpty)
Appends a part at the end of this container.
void append(TextFragment fragment, boolean collapseIfPreviousEmpty, boolean keepCodeIds)
Appends a part at the end of this container.
void append(TextPart part)
Appends a TextPart (segment or non-segment) at the end of this container.
void append(TextPart part, boolean collapseIfPreviousEmpty)
Appends a TextPart (segment or non-segment) at the end of this container.
void changePart(int partIndex)
Changes the type of a given part.
void clear()
Clears this TextContainer, removes any existing segments.
TextContainer clone()
Clones this TextContainer, including the properties.
TextContainer clone(boolean cloneProperties)
Clones this container, with or without its properties.
int compareTo(TextContainer cont, TextFragment.CompareMode compareMode)
Compares this container with another one.
boolean contentIsOneSegment()
Indicates if this container is made of a single segment that holds the whole content (i.e. there are no other parts).
static String[] contentToSplitStorage(TextContainer tc)
Creates two storage strings to serialize a given TextContainer.
static String contentToString(TextContainer tc)
Creates a string that stores the content of a given container.
int count()
Gets the number of parts (segments and non-segments) in this container.
TextFragment createJoinedContent()
TextFragment createJoinedContent(boolean keepCodeIds)
TextPart get(int index)
Gets the part (segment or non-segment) for a given part index.
String getCodedText()
Gets the coded text of the whole content (segmented or not).
String getCodedText(boolean keepCodeIds)
Gets the coded text of the whole content (segmented or not).
TextFragment getFirstContent()
Gets the content of the first part (segment or non-segment) of this container.
Segment getFirstSegment()
Returns the first Segment of this container.
TextFragment getLastContent()
Gets the content of the last part (segment or non-segment) of this container.
List<TextPart> getParts()
ISegments getSegments()
Creates a new ISegments object to access the segments of this container.
TextFragment getUnSegmentedContentCopy()
Gets a new TextFragment representing the un-segmented content of this container.
TextFragment getUnSegmentedContentCopy(boolean keepCodeIds)
Gets a new TextFragment representing the un-segmented content of this container.
boolean hasBeenSegmented()
Indicates if a segmentation has been applied to this container.
boolean hasCode()
Indicates if this container has Codes.
boolean hasText()
Indicates if this container contains at least one character that is 'text' (inline codes, segment markers, and annotation markers do not count as 'text' characters).
boolean hasText(boolean whiteSpacesAreText)
Indicates if this container contains at least one character that is not a whitespace.
boolean hasText(boolean lookInSegments, boolean whiteSpacesAreText)
Indicates if this container contains at least one character.
void insert(int partIndex, TextPart part)
Inserts a given part (segment or non-segment) at a given position.
boolean isEmpty()
Indicates if this container is empty (no text and no codes).
Iterator<TextPart> iterator()
Creates an iterator to loop through the parts (segments and non-segments) of this container.
void joinAll()
Merges back together all parts (segments and non-segments) of this container, and clears the list of segments.
int joinWithNext(int partIndex, int partCount)
Joins a given part with a specified number of its following parts.
void remove(int partIndex)
Removes the part at a given position.
void setContent(TextFragment content)
Sets the content of this TextContainer.
TextContainer setContentFromString(String data)
Sets content of this TextContainer from a string created by contentToString(TextContainer).
void setHasBeenSegmentedFlag(boolean hasBeenSegmented)
Sets the flag indicating if the content of this container has been segmented.
void setParts(TextPart... parts)
void split(int partIndex, int start, int end, boolean spannedPartIsSegment)
Splits a given part into two or three parts.
static TextContainer splitStorageToContent(String ctext, String codes)
Creates a new TextContainer object from two strings generated with contentToSplitStorage(TextContainer).
static TextContainer stringToContent(String data)
Converts a string created by contentToString(TextContainer) back into a TextContainer.
String toString()
Gets the string representation of this container.
void unwrap(boolean trimEnds, boolean collapseMode)
Unwraps the content of this container.
-
Methods inherited from class net.sf.okapi.common.resource.BaseNameable
getAnnotation, getAnnotations, getId, getMimeType, getName, getProperties, getProperty, getPropertyNames, getSkeleton, getType, hasProperty, isTranslatable, preserveWhitespaces, removeProperty, setAnnotation, setId, setIsTranslatable, setMimeType, setName, setPreserveWhitespaces, setProperty, setSkeleton, setType
-
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Methods inherited from interface net.sf.okapi.common.resource.IWithAnnotations
annotationIterator, getAnnotationsTypesAsSet, hasAnnotation, hasAnnotations, remove
-
Methods inherited from interface net.sf.okapi.common.resource.IWithProperties
propertyIterator
-
-
-
-
Constructor Detail
-
TextContainer
public TextContainer()
Creates a new empty TextContainer object.
-
TextContainer
public TextContainer(String text)
Creates a new TextContainer object with some initial text.
Parameters:
text - the initial text.
-
TextContainer
public TextContainer(TextFragment fragment)
Creates a new TextContainer object with an initial TextFragment.
Parameters:
fragment - the initial TextFragment.
-
TextContainer
public TextContainer(TextPart... parts)
Creates a new TextContainer object with initial TextParts (segment or non-segment) appended.
Parameters:
parts - the given initial parts.
-
TextContainer
public TextContainer(Segment segment)
Creates a new TextContainer object with an initial segment. If the id of the segment is null, it is set automatically.
Parameters:
segment - the initial segment.
-
-
Method Detail
-
getSegments
public ISegments getSegments()
Creates a new ISegments object to access the segments of this container.
Returns:
a new ISegments object.
-
contentToString
public static String contentToString(TextContainer tc)
Creates a string that stores the content of a given container. Use stringToContent(String) to create the container back from the string.
IMPORTANT: Only the content is saved (not the properties, annotations, etc.).
Parameters:
tc - the container holding the content to store.
Returns:
a string representing the content of the given container.
-
stringToContent
public static TextContainer stringToContent(String data)
Converts a string created by contentToString(TextContainer) back into a TextContainer.
Parameters:
data - the string to process.
Returns:
a new TextContainer with the stored content re-created.
-
setContentFromString
public TextContainer setContentFromString(String data)
Sets the content of this TextContainer from a string created by contentToString(TextContainer).
Parameters:
data - the string to process.
Returns:
this TextContainer.
-
contentToSplitStorage
public static String[] contentToSplitStorage(TextContainer tc)
Creates two storage strings to serialize a given TextContainer. Use splitStorageToContent(String, String) to create the container back from the strings.
IMPORTANT: Only the content is saved (not the properties, annotations, etc.).
Parameters:
tc - the text container to store.
Returns:
an array of two String objects: the first one contains the coded text parts, the second one contains the codes.
See Also:
splitStorageToContent(String, String)
-
splitStorageToContent
public static TextContainer splitStorageToContent(String ctext, String codes)
Creates a new TextContainer object from two strings generated with contentToSplitStorage(TextContainer).
Parameters:
ctext - the string holding the coded text parts.
codes - the string holding the codes.
Returns:
a new TextContainer object created from the strings.
See Also:
contentToSplitStorage(TextContainer)
-
toString
public String toString()
Gets the string representation of this container. If the container is segmented, the representation shows the merged segments. Inline codes are also included.
Overrides:
toString in class BaseNameable
Returns:
the string representation of this container.
-
iterator
public Iterator<TextPart> iterator()
Creates an iterator to loop through the parts (segments and non-segments) of this container.
-
compareTo
public int compareTo(TextContainer cont, TextFragment.CompareMode compareMode)
Compares this container with another one. Note: This is a costly operation if the two containers have segments and no text differences.
Parameters:
cont - the other container to compare this one with.
compareMode - the TextFragment.CompareMode to use.
Returns:
the value 0 if the objects are equal.
-
hasBeenSegmented
public boolean hasBeenSegmented()
Indicates if a segmentation has been applied to this container. Note that it does not mean there is more than one segment or one part. Use contentIsOneSegment() to check if the container counts only one segment (whether it is the result of a segmentation or simply the default single segment).
This method returns true if any method that may cause the content to be segmented has been called and no operation has resulted in un-segmenting the content since that call, or if the content has more than one part.
Returns:
true if a segmentation has been applied to this container.
See Also:
setHasBeenSegmentedFlag(boolean)
-
setHasBeenSegmentedFlag
public void setHasBeenSegmentedFlag(boolean hasBeenSegmented)
Sets the flag indicating if the content of this container has been segmented.
Parameters:
hasBeenSegmented - true to flag the content as having been segmented, false to flag it as not having been segmented.
See Also:
hasBeenSegmented()
-
contentIsOneSegment
public boolean contentIsOneSegment()
Indicates if this container is made of a single segment that holds the whole content (i.e. there are no other parts).
When this method returns true, the methods getFirstContent(), ISegments.getFirstContent(), getLastContent() and ISegments.getLastContent() return the same result.
Returns:
true if the whole content of this container is in a single segment.
See Also:
count(), ISegments.count()
-
changePart
public void changePart(int partIndex)
Changes the type of a given part. If the part was a segment, this makes it a non-segment (except if it is the only part in the content, in which case the part remains unchanged). If the part was not a segment, this makes it a segment (with its identifier set automatically).
Parameters:
partIndex - the index of the part to change. Note that even if the part is a segment, this index must be the part index, not the segment index.
-
insert
public void insert(int partIndex, TextPart part)
Inserts a given part (segment or non-segment) at a given position. If the position is already occupied, that part and all the parts to its right are shifted to the right.
If the part to insert is a segment, its id is validated.
Parameters:
partIndex - the position where to insert the new part.
part - the part to insert.
-
remove
public void remove(int partIndex)
Removes the part at a given position.
If the selected part is the last segment in the content, the part is only cleared, not removed.
Parameters:
partIndex - the position of the part to remove.
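The rule above can be illustrated with a small stand-alone model. This is only a sketch and does not use the Okapi classes: `RemoveSketch` and its `Part` type are hypothetical stand-ins, and "the last segment in the content" is read here as "the only segment left in the container".

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveSketch {
    // Hypothetical stand-in for a part: some text plus a segment flag.
    static final class Part {
        String text;
        final boolean isSegment;

        Part(String text, boolean isSegment) {
            this.text = text;
            this.isSegment = isSegment;
        }
    }

    // Removes the part at partIndex, unless it is the only remaining segment,
    // in which case its text is cleared and the part is kept.
    static void remove(List<Part> parts, int partIndex) {
        Part target = parts.get(partIndex);
        long segmentCount = parts.stream().filter(p -> p.isSegment).count();
        if (target.isSegment && segmentCount == 1) {
            target.text = "";
        } else {
            parts.remove(partIndex);
        }
    }

    public static void main(String[] args) {
        List<Part> parts = new ArrayList<>();
        parts.add(new Part("Hello", true));
        parts.add(new Part(" ", false));
        remove(parts, 0); // the only segment: cleared, not removed
        System.out.println(parts.size() + " part(s) left");
    }
}
```

Keeping the emptied segment in place is what preserves the "at least one segment" guarantee stated in the class description.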
-
append
public void append(TextFragment fragment, boolean collapseIfPreviousEmpty)
Appends a part at the end of this container.
If collapseIfPreviousEmpty is true and the current last part (segment or non-segment) is empty, the text fragment is appended to the last part. Otherwise the text fragment is appended to the content as a new non-segment part.
Important: If the container is empty, the appended part becomes a segment, as the container always has at least one segment.
Parameters:
fragment - the text fragment to append.
collapseIfPreviousEmpty - true to collapse the previous part if it is empty.
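A minimal model of the collapse rule may help. It is a sketch, not the Okapi implementation: `AppendSketch`, `Part` and `newContainer` are hypothetical names, plain strings stand in for TextFragment, and the model assumes that a new container starts with one empty segment.

```java
import java.util.ArrayList;
import java.util.List;

final class AppendSketch {
    // Hypothetical stand-in for a part: some text plus a segment flag.
    static final class Part {
        String text;
        final boolean isSegment;

        Part(String text, boolean isSegment) {
            this.text = text;
            this.isSegment = isSegment;
        }
    }

    // Assumption: an empty container holds a single empty segment.
    static List<Part> newContainer() {
        List<Part> parts = new ArrayList<>();
        parts.add(new Part("", true));
        return parts;
    }

    // If collapsing is requested and the last part is empty, the text goes
    // into that part (which keeps its type). Otherwise a new non-segment
    // part is added at the end.
    static void append(List<Part> parts, String text, boolean collapseIfPreviousEmpty) {
        Part last = parts.get(parts.size() - 1);
        if (collapseIfPreviousEmpty && last.text.isEmpty()) {
            last.text = text;
        } else {
            parts.add(new Part(text, false));
        }
    }

    public static void main(String[] args) {
        List<Part> parts = newContainer();
        append(parts, "Hello", true);  // fills the initial empty segment
        append(parts, " world", true); // last part is not empty: new non-segment
        for (Part p : parts) {
            System.out.println((p.isSegment ? "segment: " : "non-segment: ") + "[" + p.text + "]");
        }
    }
}
```

In this model the first appended text ends up in a segment only because it lands in the initial empty segment, which is one way to read the "Important" note above.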
-
append
public void append(TextFragment fragment, boolean collapseIfPreviousEmpty, boolean keepCodeIds)
Appends a part at the end of this container.
If collapseIfPreviousEmpty is true and the current last part (segment or non-segment) is empty, the text fragment is appended to the last part. Otherwise the text fragment is appended to the content as a new non-segment part.
Important: If the container is empty, the appended part becomes a segment, as the container always has at least one segment.
Parameters:
fragment - the text fragment to append.
collapseIfPreviousEmpty - true to collapse the previous part if it is empty.
keepCodeIds - true to block code balancing.
-
append
public void append(TextFragment fragment)
Appends a part at the end of this container.
This call is the same as calling append(TextFragment, boolean) with collapseIfPreviousEmpty set to true.
Parameters:
fragment - the text fragment to append.
-
append
public void append(String text, boolean collapseIfPreviousEmpty)
Appends a part with a given text at the end of this container.
If collapseIfPreviousEmpty is true and the current last part (segment or non-segment) is empty, the new text is appended to the last part. Otherwise the text is appended to the content as a new non-segment part.
Parameters:
text - the text to append.
collapseIfPreviousEmpty - true to collapse the previous part if it is empty.
-
append
public void append(String text)
Appends a part with a given text at the end of this container.
This call is the same as calling append(String, boolean) with collapseIfPreviousEmpty set to true.
Parameters:
text - the text to append.
-
append
public void append(TextPart part, boolean collapseIfPreviousEmpty)
Appends a TextPart (segment or non-segment) at the end of this container.
If collapseIfPreviousEmpty is true and the current last part (segment or non-segment) is empty, the new part replaces the last part. Otherwise the part is appended to the container as is. If the operation would result in a container without a segment, the first part is automatically converted to a segment.
Parameters:
part - the TextPart to append.
collapseIfPreviousEmpty - true to collapse the previous part if it is empty.
-
append
public void append(TextPart part)
Appends a TextPart (segment or non-segment) at the end of this container.
This call is the same as calling append(TextPart, boolean) with collapseIfPreviousEmpty set to true.
Parameters:
part - the TextPart to append.
-
getCodedText
public String getCodedText(boolean keepCodeIds)
Gets the coded text of the whole content (segmented or not). Use this method to compute segment boundaries that will be applied using ISegments.create(int, int), ISegments.create(List) or other methods.
Parameters:
keepCodeIds - if true, keeps the id of the original Code.
Returns:
the coded text of the whole content, to use as a template for segmentation.
See Also:
ISegments.create(int, int), ISegments.create(List)
-
getCodedText
public String getCodedText()
Gets the coded text of the whole content (segmented or not). Use this method to compute segment boundaries that will be applied using ISegments.create(int, int), ISegments.create(List) or other methods.
Returns:
the coded text of the whole content, to use as a template for segmentation.
See Also:
ISegments.create(int, int), ISegments.create(List)
-
split
public void split(int partIndex, int start, int end, boolean spannedPartIsSegment)
Splits a given part into two or three parts.
- If end == start or end or -1: A new part is created on the right side of the position. It has the same type as the original part.
- If start == 0: A new part is created on the left side of the original part.
- If the specified span is empty at either end of the part, or if it is equal to the whole length of the part: No change (it would result in an empty part). It has the type specified by spannedPartIsSegment.
Parameters:
partIndex - index of the part to split.
start - start of the middle part to create.
end - position just after the last character of the middle part to create.
spannedPartIsSegment - true if the new middle part should be a segment, false if it should be a non-segment.
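The span arithmetic can be sketched on a plain string. This is not the Okapi code: `SplitSketch` is a hypothetical helper that ignores part types and inline codes, and it assumes that an end value of -1 stands for the end of the part.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    // Splits the text around the span [start, end) and returns the resulting
    // pieces in order. Returns the text unchanged when the split would
    // produce an empty piece only.
    static List<String> split(String part, int start, int end) {
        if (end == -1) {
            end = part.length(); // assumption: -1 means "up to the end of the part"
        }
        if (start < 0 || end < start || end > part.length()) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        List<String> result = new ArrayList<>();
        boolean emptyAtEitherEnd = (start == end) && (start == 0 || end == part.length());
        boolean wholePart = (start == 0) && (end == part.length());
        if (emptyAtEitherEnd || wholePart) {
            result.add(part); // no change
            return result;
        }
        if (start > 0) {
            result.add(part.substring(0, start)); // left piece
        }
        if (end > start) {
            result.add(part.substring(start, end)); // spanned (middle) piece
        }
        if (end < part.length()) {
            result.add(part.substring(end)); // right piece
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("Hello world", 0, 5)); // two pieces
        System.out.println(split("Hello world", 3, 8)); // three pieces
        System.out.println(split("Hello world", 0, 0)); // unchanged
    }
}
```

A span that touches neither end of the part yields three pieces, a span that touches one end yields two, and an empty span inside the part splits it in two at that position.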
-
unwrap
public void unwrap(boolean trimEnds, boolean collapseMode)
Unwraps the content of this container.
This method replaces any sequence of white-spaces by a single space character. It also removes leading and trailing white-spaces if the parameter trimEnds is set to true.
White spaces in this context are #x9, #xA and #x20. #xD is not considered a whitespace as the content of a text container must have its line-breaks normalized to #xA.
If the container has more than one segment and collapseMode is set: non-segment parts are normalized and removed if they end up empty. If the option is not set: the method preserves at least one space between segments, even if the segments are empty.
Empty segments are always left.
Currently there is no provision to not unwrap a given span of the content.
Parameters:
trimEnds - true to remove leading and trailing white-spaces.
collapseMode - true to remove non-segment parts that end up empty after the unwrapping.
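The whitespace rule itself can be shown on a plain string. The sketch below is not the Okapi implementation: `UnwrapSketch` is a hypothetical helper that applies only the collapsing and trimming described above, without segments, inline codes or collapseMode.

```java
final class UnwrapSketch {
    // Whitespace as defined above: #x9, #xA and #x20. #xD is deliberately excluded.
    static boolean isWhitespace(char c) {
        return c == '\t' || c == '\n' || c == ' ';
    }

    // Replaces each run of whitespace with one space; when trimEnds is true,
    // leading and trailing runs are dropped instead.
    static String unwrap(String text, boolean trimEnds) {
        StringBuilder out = new StringBuilder();
        boolean pendingRun = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isWhitespace(c)) {
                pendingRun = true;
                continue;
            }
            if (pendingRun) {
                boolean leading = out.length() == 0;
                if (!(trimEnds && leading)) {
                    out.append(' ');
                }
                pendingRun = false;
            }
            out.append(c);
        }
        if (pendingRun && !trimEnds) {
            out.append(' '); // trailing run kept as a single space
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + unwrap("  a \t\n b  ", true) + "]");
        System.out.println("[" + unwrap("  a \t\n b  ", false) + "]");
    }
}
```

Because #xD is not treated as whitespace here, a stray carriage return survives the call, which is why the content is expected to have its line-breaks normalized to #xA beforehand.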
-
getFirstContent
public TextFragment getFirstContent()
Gets the content of the first part (segment or non-segment) of this container.
This method always returns the same result as ISegments.getFirstContent() if contentIsOneSegment() is true.
Returns:
the content of the first part (segment or non-segment) of this container.
See Also:
ISegments.getFirstContent(), getLastContent(), ISegments.getLastContent()
-
getLastContent
public TextFragment getLastContent()
Gets the content of the last part (segment or non-segment) of this container.
This method always returns the same result as ISegments.getLastContent() if contentIsOneSegment() is true.
Returns:
the content of the last part (segment or non-segment) of this container.
See Also:
ISegments.getLastContent(), getFirstContent(), ISegments.getFirstContent()
-
clone
public TextContainer clone()
Clones this TextContainer, including the properties.
-
clone
public TextContainer clone(boolean cloneProperties)
Clones this container, with or without its properties.
Parameters:
cloneProperties - indicates if the properties should be cloned.
Returns:
a new TextContainer object that is a copy of this one.
-
getUnSegmentedContentCopy
public TextFragment getUnSegmentedContentCopy()
Gets a new TextFragment representing the un-segmented content of this container.
Important: This is an expensive method.
Returns:
an un-segmented copy of the content of this container.
-
getUnSegmentedContentCopy
public TextFragment getUnSegmentedContentCopy(boolean keepCodeIds)
Gets a new TextFragment representing the un-segmented content of this container.
Important: This is an expensive method.
Returns:
an un-segmented copy of the content of this container.
-
setContent
public void setContent(TextFragment content)
Sets the content of this TextContainer. Any existing segmentation is removed. The content becomes a single segment content.
Parameters:
content - the new content to set.
-
setParts
public void setParts(TextPart... parts)
-
clear
public void clear()
Clears this TextContainer, removes any existing segments. The content becomes a single empty segment content. Keeps annotations.
Specified by:
clear in interface IWithAnnotations
-
hasText
public boolean hasText(boolean lookInSegments, boolean whiteSpacesAreText)
Indicates if this container contains at least one character. Inline codes and annotation markers do not count as characters.
- If the whole content is a single segment, the check is performed on that content and the option lookInSegments is ignored.
- If the content has several segments or if the single segment is not the whole content, each segment is checked only if lookInSegments is set.
- The holder is always checked if no text is found in the segments.
Parameters:
lookInSegments - indicates if the possible segments in this container should be looked at. If this parameter is set to false, the segment markers are treated as codes.
whiteSpacesAreText - indicates if whitespaces should be considered text characters or not.
Returns:
true if this container contains at least one character according to the given options.
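One possible reading of the three rules above, as a stand-alone sketch. It does not use the Okapi classes: `HasTextSketch` and `Part` are hypothetical, "the holder" is taken to mean the non-segment parts, inline codes are not modeled, and `Character.isWhitespace` is used as an approximation of what counts as whitespace.

```java
import java.util.Arrays;
import java.util.List;

final class HasTextSketch {
    // Hypothetical stand-in for a part: some text plus a segment flag.
    static final class Part {
        final String text;
        final boolean isSegment;

        Part(String text, boolean isSegment) {
            this.text = text;
            this.isSegment = isSegment;
        }
    }

    // True if the string has at least one character that counts as text.
    static boolean textIn(String s, boolean whiteSpacesAreText) {
        for (int i = 0; i < s.length(); i++) {
            if (whiteSpacesAreText || !Character.isWhitespace(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    static boolean hasText(List<Part> parts, boolean lookInSegments, boolean whiteSpacesAreText) {
        boolean contentIsOneSegment = parts.size() == 1 && parts.get(0).isSegment;
        // Segments are checked when the content is one segment, or on request.
        if (contentIsOneSegment || lookInSegments) {
            for (Part p : parts) {
                if (p.isSegment && textIn(p.text, whiteSpacesAreText)) {
                    return true;
                }
            }
        }
        // Nothing found in the segments: check the non-segment parts.
        for (Part p : parts) {
            if (!p.isSegment && textIn(p.text, whiteSpacesAreText)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Part> parts = Arrays.asList(
            new Part("Hello", true), new Part(" ", false), new Part("world", true));
        System.out.println(hasText(parts, true, false));
        System.out.println(hasText(parts, false, false));
    }
}
```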
-
hasText
public boolean hasText(boolean whiteSpacesAreText)
Indicates if this container contains at least one character that is not a whitespace. All parts (segments and non-segments) are checked.
Parameters:
whiteSpacesAreText - indicates if whitespaces should be considered text characters or not.
Returns:
true if this container contains at least one character that is not a whitespace.
-
hasText
public boolean hasText()
Indicates if this container contains at least one character that is 'text' (inline codes, segment markers, and annotation markers do not count as 'text' characters). This method has the same result as calling hasText(boolean, boolean) with the parameters true and false.
Returns:
true if this container contains at least one character that is not a whitespace.
-
isEmpty
public boolean isEmpty()
Indicates if this container is empty (no text and no codes).
Returns:
true if this container is empty.
-
hasCode
public boolean hasCode()
Indicates if this container has Codes.
Returns:
true if this container has codes.
-
get
public TextPart get(int index)
Gets the part (segment or non-segment) for a given part index.
Parameters:
index - the index of the part to retrieve. The first part has the index 0, the second has the index 1, etc.
Returns:
the part (segment or non-segment) for the given index.
Throws:
IndexOutOfBoundsException - if the index is out of bounds.
See Also:
ISegments.get(int)
-
count
public int count()
Gets the number of parts (segments and non-segments) in this container. This method always returns at least 1.
Returns:
the number of parts (segments and non-segments) in this container.
See Also:
ISegments.count()
-
createJoinedContent
public TextFragment createJoinedContent(boolean keepCodeIds)
-
createJoinedContent
public TextFragment createJoinedContent()
-
joinAll
public void joinAll()
Merges back together all parts (segments and non-segments) of this container, and clears the list of segments. The content becomes a single segment content.
WARNING: All TextPart annotations and properties are lost after joining.
-
joinWithNext
public int joinWithNext(int partIndex, int partCount)
Joins a given part with a specified number of its following parts.
If the resulting part is the only part in the container and is not a segment, it is automatically changed into a segment.
joinWithNext(0, -1) has the same effect as joinAll().
Parameters:
partIndex - the index of the part where to append the following parts.
partCount - the number of parts to join. You can use -1 to indicate all the parts after the initial one.
Returns:
the number of parts joined to the given part (and removed from the list of parts).
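The index and count arithmetic can be sketched on a list of strings. This is not the Okapi code: `JoinSketch` is a hypothetical helper that ignores part types, and clamping an oversized partCount to the number of available parts is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class JoinSketch {
    // Appends up to partCount following parts to the part at partIndex and
    // removes them from the list. A negative partCount (such as -1) means
    // "all the parts after partIndex". Returns how many parts were merged.
    static int joinWithNext(List<String> parts, int partIndex, int partCount) {
        int available = parts.size() - 1 - partIndex;
        // Assumption: a count larger than what is available is clamped.
        int toJoin = (partCount < 0) ? available : Math.min(partCount, available);
        StringBuilder joined = new StringBuilder(parts.get(partIndex));
        for (int i = 0; i < toJoin; i++) {
            // The next part always sits right after partIndex once the previous one is removed.
            joined.append(parts.remove(partIndex + 1));
        }
        parts.set(partIndex, joined.toString());
        return toJoin;
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>(Arrays.asList("Hello", " ", "world", "!"));
        int merged = joinWithNext(parts, 0, -1); // same effect as joining everything
        System.out.println(merged + " merged, result: " + parts);
    }
}
```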
-
-