Class TextFragment

    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int CHARBASE
      Special value used as the base of inline code indices.
      protected List<Code> codes
      List of the inline codes for this fragment.
      protected boolean isBalanced
      Flag indicating if the opening/closing inline codes of this fragment have been balanced or not.
      protected int lastCodeID
      Value of the last inline code ID in this fragment.
      static int MARKER_CLOSING
      Special character marker for a closing inline code.
      static int MARKER_ISOLATED
      Special character marker for an isolated inline code.
      static int MARKER_OPENING
      Special character marker for an opening inline code.
      static Pattern MARKERS_REGEX  
      static String REFMARKER_END
      Marker for end of reference.
      static String REFMARKER_SEP
      Marker for reference separator.
      static String REFMARKER_START
      Marker for start of reference.
      protected StringBuilder text
      Coded text buffer of this fragment.
    • Constructor Summary

      Constructors 
      Constructor Description
      TextFragment()
      Creates an empty TextFragment.
      TextFragment​(String text)
      Creates a TextFragment with a given text.
      TextFragment​(String text, int lastCodeId)
      Creates a TextFragment with a given text and an initial id value for codes.
      TextFragment​(String codedText, List<Code> codes)
      Creates a TextFragment with the content made of a given coded text and a list of codes.
      TextFragment​(TextFragment fragment)
      Creates a TextFragment with the content of a given TextFragment.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void alignCodeIds​(TextFragment base)  
      void alignCodeIds​(TextFragment base, CodeMatchStrategy strategy)
      Aligns the code IDs of this fragment with the ones of a given fragment.
      int annotate​(int start, int end, String type, InlineAnnotation annotation)
      Annotates a section of this text.
      TextFragment append​(char value)
      Appends a character to the fragment.
      TextFragment append​(CharSequence csq)
      Appends the specified character sequence to this fragment.
      TextFragment append​(CharSequence csq, int start, int end)
      Appends a subsequence of the specified character sequence to this fragment.
      void append​(CharSequence text, Function<Code,​Code> codeProcessor)
      Appends a CharSequence.
      TextFragment append​(String text)
      Appends a string to the fragment.
      void append​(String text, Function<Code,​Code> codeProcessor)
      Appends a string.
      TextFragment append​(Code code)
      Appends an existing code to this fragment.
      TextFragment append​(TextFragment fragment)
      Appends a TextFragment object to this fragment.
      Code append​(TextFragment.TagType tagType, String type, String data)
      Appends a new code to the text.
      Code append​(TextFragment.TagType tagType, String type, String data, int id)
      Appends a new code to the text, when the code has a defined identifier.
      Code append​(TextFragment.TagType tagType, String type, InlineAnnotation annotation)
      Appends an annotation-type code to this text.
      TextFragment append​(TextFragment fragment, boolean keepCodeIds)
      Appends a TextFragment object to this fragment.
      void balanceMarkers()
      Balances the markers based on the tag type of the codes.
      int changeToCode​(int start, int end, TextFragment.TagType tagType, String type)
      Changes a section of the coded text into a single code.
      int changeToCode​(int start, int end, TextFragment.TagType tagType, String type, boolean setDisplayText)
      Changes a section of the coded text into a single code.
      char charAt​(int index)
      Returns the character at the specified index in the coded text of this fragment.
      TextFragment cleanCodes()
      Removes all codes both in the Codes list and the markers.
      void cleanUnusedCodes()
      Removes all codes that have no data and no annotation.
      void clear()
      Clears the fragment of all content.
      TextFragment clone()
      Clones this TextFragment.
      int compareTo​(TextFragment tf)
      Compares an object with this TextFragment.
      int compareTo​(TextFragment frag, TextFragment.CompareMode compMode)
      Compares with another TextFragment.
      boolean equals​(Object object)  
      int findClosingCodePosition​(int id, int indexOfOpening)
      Finds the position in this coded text of the closing code for a given opening code.
      int findOpeningCodePosition​(int id, int indexOfClosing)
      Finds the position in this coded text of the opening code for a given closing code.
      static int fromFragmentToString​(TextFragment frag, int pos)
      Gets the position in the string representation of a fragment of a given position in that fragment.
      List<AnnotatedSpan> getAnnotatedSpans​(String type)
      Gets the list of all spans of text annotated with a given type of annotation.
      List<Code> getClonedCodes()
      Gets a list of the copy of the codes for this fragment.
      Code getCode​(char indexAsChar)
      Gets the code for a given index formatted as character (the second special character in a marker in a coded text string).
      Code getCode​(int index)
      Gets the code for a given index.
      Code getCode​(Code fc)
      Finds the first code with a given ID and tagType in this fragment, or null if there is no such code.
      String getCodedText()
      Gets the coded text representation of the fragment.
      String getCodedText​(int start, int end)
      Gets the portion of coded text for a given section of the coded text.
      int getCodePosition​(int index)  
      List<Code> getCodes()
      Gets the list of all codes for the fragment.
      List<Code> getCodes​(int start, int end)
      Gets a copy of the list of the codes that are within a given section of coded text.
      int getIndex​(int id)
      Gets the index value for the first in-line code (in the codes list) with a given identifier.
      int getIndexForClosing​(int id)
      Gets the index value for the closing in-line code (in the codes list) with a given identifier.
      int getIndexForOpening​(int id)
      Gets the index value for the opening in-line code (in the codes list) with a given identifier.
      Code getLastCode()
      Return the last code appended to this fragment, or null if there are no codes.
      int getLastCodeId()
      Gets the last value used for code id.
      static Object[] getRefMarker​(StringBuilder text)
      Helper method to retrieve a reference marker from a string.
      String getText()
      Gets the text of the fragment (all codes are removed).
      static String getText​(String codedText)
      Helper method that will take a coded string and return a text only version.
      boolean hasAnnotation()
      Indicates if this text has at least one annotation.
      boolean hasAnnotation​(String type)
      Indicates if this text has at least one annotation of a given type.
      boolean hasCode()
      Indicates if the fragment contains at least one code.
      int hashCode()  
      boolean hasReference()
      Indicates if this TextFragment contains any in-line code with a reference.
      boolean hasText()
      Indicates if this fragment contains at least one character other than a whitespace.
      boolean hasText​(boolean whiteSpacesAreText)
      Indicates if this fragment contains at least one character (inline codes, segment markers, and annotation markers do not count as characters).
      static int indexOfFirstNonWhitespace​(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)
      Helper method to find the first non-whitespace character of a coded text, starting at a given position and no farther than another given position.
      static int indexOfLastNonWhitespace​(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)
      Helper method to find, from the back, the first non-whitespace character of a coded text, starting at a given position and no farther than another given position.
      void insert​(int offset, String str)
      Inserts a String object to this fragment.
      void insert​(int offset, Code code)
      Inserts a Code object to this fragment.
      void insert​(int offset, TextFragment fragment)
      Inserts a TextFragment object to this fragment.
      void insert​(int offset, TextFragment fragment, boolean keepCodeIds)
      Inserts a TextFragment object to this fragment.
      void invalidate()
      Sets the fragment in a state where it has to be re-balanced before being used for output.
      boolean isEmpty()
      Indicates if the fragment is empty (no text and no codes).
      static boolean isMarker​(char ch)
      Helper method that checks if a given character is an inline code marker.
      int length()
      Returns the number of characters in the coded text of this fragment.
      static String makeRefMarker​(String id)
      Helper method to build a reference marker string from a given identifier.
      static String makeRefMarker​(String id, String propertyName)
      Helper method to build a reference marker string from a given identifier and a property name.
      int minimumIdValue()
      Returns the smallest id value
      void remove​(int start, int end)
      Removes a section of the fragment (including its codes).
      void removeAnnotations()
      Removes all annotations in this text.
      void removeAnnotations​(String type)
      Removes all annotations of a given type in this text.
      void removeCode​(Code code)
      Removes the given Code from this fragment.
      int renumberCodes()
      Renumbers the IDs of the codes in the fragment.
      int renumberCodes​(int idBase)
      Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base.
      int renumberCodes​(int idBase, boolean mindPosition)
      Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base.
      void setCodedText​(String newCodedText)
      Sets the coded text of the fragment, using its existing codes.
      void setCodedText​(String newCodedText, boolean allowCodeDeletion)
      Sets the coded text of the fragment, using its existing codes.
      void setCodedText​(String newCodedText, List<Code> newCodes)
      Sets the coded text of the fragment and its corresponding codes.
      void setCodedText​(String newCodedText, List<Code> newCodes, boolean allowCodeDeletion)
      Sets the coded text of the fragment and its corresponding codes.
      protected void setCodes​(List<Code> codes)  
      TextFragment subSequence​(int start, int end)
      Gets a copy of a sub-sequence of this object.
      static char toChar​(int index)
      Helper method to convert a marker index to its character value in the coded text string.
      static int toIndex​(char index)
      Helper method to convert the index-coded-as-character part of a marker into its index value.
      String toOuterText()  
      String toString()
      Gets the coded text for this fragment.
      String toText()
      Returns the content of this fragment, including the original codes whenever possible.
      static void unwrap​(TextFragment frag)
      Unwraps the content of a TextFragment.
    • Field Detail

      • MARKER_OPENING

        public static final int MARKER_OPENING
        Special character marker for an opening inline code.
        See Also:
        Constant Field Values
      • MARKER_CLOSING

        public static final int MARKER_CLOSING
        Special character marker for a closing inline code.
        See Also:
        Constant Field Values
      • MARKER_ISOLATED

        public static final int MARKER_ISOLATED
        Special character marker for an isolated inline code.
        See Also:
        Constant Field Values
      • CHARBASE

        public static final int CHARBASE
        Special value used as the base of inline code indices.
        See Also:
        Constant Field Values
      • MARKERS_REGEX

        public static final Pattern MARKERS_REGEX
      • text

        protected StringBuilder text
        Coded text buffer of this fragment.
      • codes

        protected List<Code> codes
        List of the inline codes for this fragment.
      • isBalanced

        protected boolean isBalanced
        Flag indicating if the opening/closing inline codes of this fragment have been balanced or not.
      • lastCodeID

        protected int lastCodeID
        Value of the last inline code ID in this fragment.
    • Constructor Detail

      • TextFragment

        public TextFragment()
        Creates an empty TextFragment.
      • TextFragment

        public TextFragment​(String text)
        Creates a TextFragment with a given text.
        Parameters:
        text - the text to use.
      • TextFragment

        public TextFragment​(String text,
                            int lastCodeId)
        Creates a TextFragment with a given text and an initial id value for codes. This constructor can be used to create fragments that will be appended to an existing one.
        Parameters:
        text - the text to use.
        lastCodeId - value to use to start the code id. The first new code will have for id this value+1. The value should be -1 or a positive number. Values below -1 will be automatically reset to -1.
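        The id rule described for lastCodeId can be pictured with a small counter. This is only a sketch of the stated behavior, not Okapi's code; the class CodeIdCounter and its nextId() method are hypothetical names introduced for illustration.

```java
/** Hypothetical stand-in for the id bookkeeping described above; not part of Okapi. */
final class CodeIdCounter {
    private int lastCodeId;

    CodeIdCounter(int lastCodeId) {
        // Values below -1 are reset to -1, as the parameter description states.
        this.lastCodeId = Math.max(lastCodeId, -1);
    }

    /** Returns the id for the next new code: the previous value plus one. */
    int nextId() {
        return ++lastCodeId;
    }

    public static void main(String[] args) {
        CodeIdCounter fresh = new CodeIdCounter(-5);     // reset to -1
        System.out.println(fresh.nextId());              // prints 0
        CodeIdCounter continued = new CodeIdCounter(3);  // continues after id 3
        System.out.println(continued.nextId());          // prints 4
    }
}
```

        Passing the last id of an existing fragment keeps the ids of the new fragment from colliding with it when the two are later joined.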
      • TextFragment

        public TextFragment​(TextFragment fragment)
        Creates a TextFragment with the content of a given TextFragment.
        Parameters:
        fragment - the content to use.
      • TextFragment

        public TextFragment​(String codedText,
                            List<Code> codes)
        Creates a TextFragment with the content made of a given coded text and a list of codes.
        Parameters:
        codedText - the coded text.
        codes - the list of codes.
    • Method Detail

      • toChar

        public static char toChar​(int index)
        Helper method to convert a marker index to its character value in the coded text string.
        Parameters:
        index - the index value to encode.
        Returns:
        the corresponding character value.
      • toIndex

        public static int toIndex​(char index)
        Helper method to convert the index-coded-as-character part of a marker into its index value.
        Parameters:
        index - the character to decode.
        Returns:
        the corresponding index value.
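        The round trip between a code index and its character form can be sketched as an offset from CHARBASE. The value 0xE110 used below is an assumption (check "Constant Field Values" for the real one), and the class MarkerIndexSketch is a hypothetical stand-in, not the library's implementation.

```java
/** Hypothetical sketch of the index <-> character conversion; not Okapi's code. */
final class MarkerIndexSketch {
    // Assumed value of TextFragment.CHARBASE; verify against the constant field values.
    static final int CHARBASE = 0xE110;

    /** Encodes a code index as the second character of a marker. */
    static char toChar(int index) {
        return (char) (index + CHARBASE);
    }

    /** Decodes the second character of a marker back into a code index. */
    static int toIndex(char index) {
        return index - CHARBASE;
    }

    public static void main(String[] args) {
        char encoded = toChar(3);
        System.out.println(Integer.toHexString(encoded)); // prints e113
        System.out.println(toIndex(encoded));             // prints 3
    }
}
```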
      • makeRefMarker

        public static String makeRefMarker​(String id)
        Helper method to build a reference marker string from a given identifier.
        Parameters:
        id - the identifier to use.
        Returns:
        the reference marker constructed from the ID.
      • makeRefMarker

        public static String makeRefMarker​(String id,
                                           String propertyName)
        Helper method to build a reference marker string from a given identifier and a property name. Any "\" or "]" character in the identifier or in the property name is escaped with "\".
        Parameters:
        id - The identifier to use.
        propertyName - the name of the property to use.
        Returns:
        the reference marker constructed from the identifier and the property name.
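        The escaping rule can be illustrated with a small builder. The delimiter strings "[#$", "@%" and "]" are assumptions standing in for REFMARKER_START, REFMARKER_SEP and REFMARKER_END (their values are not given here), and RefMarkerBuilderSketch is a hypothetical class, not the library's implementation.

```java
/** Hypothetical sketch of building a reference marker; delimiter values are assumed. */
final class RefMarkerBuilderSketch {
    static final String START = "[#$"; // assumed REFMARKER_START
    static final String SEP = "@%";    // assumed REFMARKER_SEP
    static final String END = "]";     // assumed REFMARKER_END

    /** Escapes backslashes first, then closing brackets, each with a backslash. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("]", "\\]");
    }

    /** Builds a marker from an id and an optional property name (null for none). */
    static String makeRefMarker(String id, String propertyName) {
        StringBuilder sb = new StringBuilder(START).append(escape(id));
        if (propertyName != null) {
            sb.append(SEP).append(escape(propertyName));
        }
        return sb.append(END).toString();
    }

    public static void main(String[] args) {
        System.out.println(makeRefMarker("tu1", null));    // prints [#$tu1]
        System.out.println(makeRefMarker("tu1", "value")); // prints [#$tu1@%value]
    }
}
```

        Escaping the backslash before the bracket matters: done in the other order, the backslash added in front of "]" would itself be doubled.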
      • getRefMarker

        public static Object[] getRefMarker​(StringBuilder text)
        Helper method to retrieve a reference marker from a string. The identifier and the property parts are unescaped.
        Parameters:
        text - the text to search for a reference marker.
        Returns:
        null if no reference marker has been found. An array of four objects if a reference marker has been found:
        • Object 0: The identifier of the reference.
        • Object 1: The start position of the reference marker in the string.
        • Object 2: The end position of the reference marker in the string.
        • Object 3: The name of the property if there is one, null otherwise.
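        The four-element result can be sketched with a simple scanner. As assumptions, the delimiters are taken to be "[#$", "@%" and "]", and the end position is taken to be the offset just after the closing "]"; neither is stated here. RefMarkerParserSketch is a hypothetical class, not the library's implementation.

```java
/** Hypothetical sketch of locating and unescaping a reference marker. */
final class RefMarkerParserSketch {
    static final String START = "[#$"; // assumed REFMARKER_START
    static final String SEP = "@%";    // assumed REFMARKER_SEP
    static final char END = ']';       // assumed REFMARKER_END

    /** Returns {id, start, end (exclusive, by assumption), propertyName or null}, or null if no marker. */
    static Object[] getRefMarker(CharSequence input) {
        String text = input.toString();
        int start = text.indexOf(START);
        if (start < 0) {
            return null;
        }
        StringBuilder id = new StringBuilder();
        StringBuilder prop = null; // stays null until the separator is seen
        int i = start + START.length();
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                // Escaped character: keep the next character literally.
                (prop == null ? id : prop).append(text.charAt(i + 1));
                i += 2;
            } else if (c == END) {
                return new Object[] { id.toString(), start, i + 1,
                        prop == null ? null : prop.toString() };
            } else if (prop == null && text.startsWith(SEP, i)) {
                prop = new StringBuilder();
                i += SEP.length();
            } else {
                (prop == null ? id : prop).append(c);
                i++;
            }
        }
        return null; // opening found but never closed
    }

    public static void main(String[] args) {
        Object[] r = getRefMarker(new StringBuilder("ab[#$tu1@%name]cd"));
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // prints tu1 2 15 name
    }
}
```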
      • fromFragmentToString

        public static int fromFragmentToString​(TextFragment frag,
                                               int pos)
        Gets the position in the string representation of a fragment of a given position in that fragment.

        For example, if you find a match in a coded text string, use this method to convert the boundaries of the match into character positions in the string representing the fragment (4 in "xxyyMATCHyyxx" -> 6 in "{b}{i}MATCH{/i}{/b}").

        Parameters:
        frag - the fragment where the position is located.
        pos - the position.
        Returns:
        the same position, but in the string representation of the fragment.
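        The "4 -> 6" example can be reproduced with a simplified model in which every code occupies two characters in the coded text (a marker followed by an index character) and expands to its raw data in the string form. The marker values come from the append documentation on this page; CHARBASE = 0xE110 is an assumption, and PositionMappingSketch with its plain list of code strings is a hypothetical stand-in for TextFragment and Code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of mapping a coded-text position to a string position. */
final class PositionMappingSketch {
    static final int CHARBASE = 0xE110; // assumed value of TextFragment.CHARBASE

    static boolean isMarker(char ch) {
        return ch == 0xE101 || ch == 0xE102 || ch == 0xE103;
    }

    /** Walks the coded text up to pos, counting each code as the length of its data. */
    static int fromCodedToString(String codedText, List<String> codeData, int pos) {
        int out = 0;
        int i = 0;
        while (i < pos && i < codedText.length()) {
            if (isMarker(codedText.charAt(i)) && i + 1 < codedText.length()) {
                int index = codedText.charAt(i + 1) - CHARBASE;
                out += codeData.get(index).length();
                i += 2; // a code takes two characters in the coded text
            } else {
                out++;
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList("{b}", "{i}", "{/i}", "{/b}");
        String coded = "\uE101\uE110\uE101\uE111MATCH\uE102\uE112\uE102\uE113";
        System.out.println(fromCodedToString(coded, codes, 4)); // prints 6
    }
}
```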
      • indexOfLastNonWhitespace

        public static int indexOfLastNonWhitespace​(String codedText,
                                                   int fromIndex,
                                                   int untilIndex,
                                                   boolean openingMarkerIsWS,
                                                   boolean closingMarkerIsWS,
                                                   boolean isolatedMarkerIsWS,
                                                   boolean whitespaceIsWS)
        Helper method to find, from the back, the first non-whitespace character of a coded text, starting at a given position and no farther than another given position.
        Parameters:
        codedText - the coded text to process.
        fromIndex - the first position to check (must be greater than or equal to untilIndex). Use -1 to point to the last position of the text.
        untilIndex - the last position to check (must be less than or equal to fromIndex).
        openingMarkerIsWS - indicates if opening markers count as whitespace.
        closingMarkerIsWS - indicates if closing markers count as whitespace.
        isolatedMarkerIsWS - indicates if isolated markers count as whitespace.
        whitespaceIsWS - indicates if whitespace characters count as whitespace.
        Returns:
        the first non-whitespace character position from the back, given the parameters, or -1 if the text is null or empty, or if no non-whitespace has been found after the character at the position untilIndex has been checked. If the last non-whitespace found is a code, the position returned is the index of the second special character marker for that code.
      • indexOfFirstNonWhitespace

        public static int indexOfFirstNonWhitespace​(String codedText,
                                                    int fromIndex,
                                                    int untilIndex,
                                                    boolean openingMarkerIsWS,
                                                    boolean closingMarkerIsWS,
                                                    boolean isolatedMarkerIsWS,
                                                    boolean whitespaceIsWS)
        Helper method to find the first non-whitespace character of a coded text, starting at a given position and no farther than another given position.
        Parameters:
        codedText - the coded text to process.
        fromIndex - the first position to check (must be less than or equal to untilIndex).
        untilIndex - the last position to check (must be greater than or equal to fromIndex). Use -1 to point to the last position of the text.
        openingMarkerIsWS - indicates if opening markers count as whitespace.
        closingMarkerIsWS - indicates if closing markers count as whitespace.
        isolatedMarkerIsWS - indicates if isolated markers count as whitespace.
        whitespaceIsWS - indicates if whitespace characters count as whitespace.
        Returns:
        the first non-whitespace character position, given the parameters, or -1 if the text is null or empty, or no non-whitespace has been found after the character at the position untilIndex has been checked.
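        A simplified forward scan shows how the four flags interact. This sketch is not the library's implementation: FirstNonWhitespaceSketch is a hypothetical class, it uses Character.isWhitespace as its whitespace test, and for a code it returns the position of the first marker character, which is a choice made here rather than documented behavior.

```java
/** Hypothetical sketch of a forward scan for the first non-whitespace position. */
final class FirstNonWhitespaceSketch {
    static final char OPENING = '\uE101';
    static final char CLOSING = '\uE102';
    static final char ISOLATED = '\uE103';

    static int indexOfFirstNonWhitespace(String codedText, int fromIndex, int untilIndex,
            boolean openingIsWS, boolean closingIsWS, boolean isolatedIsWS, boolean whitespaceIsWS) {
        if (codedText == null || codedText.isEmpty()) {
            return -1;
        }
        int lastPos = codedText.length() - 1;
        int last = (untilIndex == -1) ? lastPos : Math.min(untilIndex, lastPos);
        for (int i = Math.max(fromIndex, 0); i <= last; i++) {
            char c = codedText.charAt(i);
            boolean marker = true;
            boolean skip;
            switch (c) {
                case OPENING:  skip = openingIsWS;  break;
                case CLOSING:  skip = closingIsWS;  break;
                case ISOLATED: skip = isolatedIsWS; break;
                default:
                    marker = false;
                    skip = whitespaceIsWS && Character.isWhitespace(c);
            }
            if (!skip) {
                return i;
            }
            if (marker) {
                i++; // also step over the index character that follows a marker
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String coded = "\uE101\uE110  text";
        System.out.println(indexOfFirstNonWhitespace(coded, 0, -1, true, true, true, true));  // prints 4
        System.out.println(indexOfFirstNonWhitespace(coded, 0, -1, false, true, true, true)); // prints 0
    }
}
```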
      • unwrap

        public static void unwrap​(TextFragment frag)
        Unwraps the content of a TextFragment. Each sequence of consecutive white spaces is replaced by a single space character, and any white space at the head or the end of the text is trimmed. White spaces here are: space, tab, CR and LF. Existing segments are not unwrapped.
        Parameters:
        frag - the text fragment to unwrap.
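        On a plain string, the collapsing and trimming described above amounts to two regular-expression passes. This is a sketch of the rule only: UnwrapSketch is a hypothetical class that works on a String rather than a TextFragment and ignores segments.

```java
/** Hypothetical sketch of the whitespace rule applied to a plain string. */
final class UnwrapSketch {
    static String unwrap(String text) {
        // Collapse every run of space, tab, CR or LF into a single space.
        String collapsed = text.replaceAll("[ \\t\\r\\n]+", " ");
        // After collapsing, at most one space remains at each end; remove it.
        return collapsed.replaceAll("^ | $", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + unwrap("  line one \r\n\t line two  ") + "]"); // prints [line one line two]
    }
}
```

        Inline code markers are not among the four white space characters, so a pattern like this one leaves them untouched.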
      • isMarker

        public static boolean isMarker​(char ch)
        Helper method that checks if a given character is an inline code marker.
        Parameters:
        ch - the character to check.
        Returns:
        true if the character is a code marker, false if it is not.
      • getText

        public static String getText​(String codedText)
        Helper method that will take a coded string and return a text only version.
        Parameters:
        codedText - string with possible TextFragment codes.
        Returns:
        the given string stripped of any codes.
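        The marker test and the text-only conversion can be sketched together, since stripping codes means skipping each marker and the index character after it. The three marker values are those named in the append documentation on this page; StripCodesSketch is a hypothetical class, not the library's implementation.

```java
/** Hypothetical sketch of marker detection and code stripping on a coded string. */
final class StripCodesSketch {
    static boolean isMarker(char ch) {
        return ch == 0xE101 || ch == 0xE102 || ch == 0xE103;
    }

    static String getText(String codedText) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < codedText.length(); i++) {
            char c = codedText.charAt(i);
            if (isMarker(c)) {
                i++; // skip the index character that follows the marker
                continue;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(getText("A\uE101\uE110bold\uE102\uE111Z")); // prints AboldZ
    }
}
```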
      • clone

        public TextFragment clone()
        Clones this TextFragment.
        Overrides:
        clone in class Object
        Returns:
        a new TextFragment that is a copy of this one.
      • hasReference

        public boolean hasReference()
        Indicates if this TextFragment contains any in-line code with a reference.
        Returns:
        true if there is one or more in-line codes with a reference, false if there is no reference.
      • append

        public TextFragment append​(String text)
        Appends a string to the fragment. If the string is null, it is ignored.
        Parameters:
        text - the string to append.
      • append

        public void append​(String text,
                           Function<Code,​Code> codeProcessor)
        Appends a string. If the string is null, it is ignored. If the string contains Okapi markers (Unicode 0xE101, 0xE102 or 0xE103), they are replaced by a Code "masking" the markers (which will result in MARKER_OPENING (0xE101) being in the coded text).
        Parameters:
        text - the string to append.
        codeProcessor - when a Code is generated to mask an Okapi marker, this function is called on it and can modify or replace the generated code.
      • append

        public void append​(CharSequence text,
                           Function<Code,​Code> codeProcessor)
        Appends a CharSequence. If the sequence is null, it is ignored. If the sequence contains Okapi markers (Unicode 0xE101, 0xE102 or 0xE103), they are replaced by a Code "masking" the markers (which will result in MARKER_OPENING (0xE101) being in the coded text).
        Parameters:
        text - the character sequence to append.
        codeProcessor - when a Code is generated to mask an Okapi marker, this function is called on it and can modify or replace the generated code.
      • append

        public TextFragment append​(TextFragment fragment)
        Appends a TextFragment object to this fragment. If the fragment is null, it is ignored.
        Parameters:
        fragment - the TextFragment to append.
        Returns:
        this fragment.
      • append

        public TextFragment append​(TextFragment fragment,
                                   boolean keepCodeIds)
        Appends a TextFragment object to this fragment. If the fragment is null, it is ignored.
        Parameters:
        fragment - the TextFragment to append.
        keepCodeIds - true to keep the Code.id values of the appended fragment without renumbering them.
        Returns:
        this fragment.
      • append

        public TextFragment append​(Code code)
        Appends an existing code to this fragment.
        Parameters:
        code - the existing code to append.
        Returns:
        a reference to this fragment
      • append

        public Code append​(TextFragment.TagType tagType,
                           String type,
                           InlineAnnotation annotation)
        Appends an annotation-type code to this text.
        Parameters:
        tagType - the tag type of the code (e.g. TagType.OPENING).
        type - the type of the annotation (e.g. "protected").
        annotation - the annotation to add (can be null).
        Returns:
        the new code that was added to this text.
      • append

        public Code append​(TextFragment.TagType tagType,
                           String type,
                           String data)
        Appends a new code to the text.
        Parameters:
        tagType - the tag type of the code (e.g. TagType.OPENING).
        type - the type of the code (e.g. "bold").
        data - the raw code itself. (e.g. "<b>").
        Returns:
        the new code that was added to the text.
      • append

        public Code append​(TextFragment.TagType tagType,
                           String type,
                           String data,
                           int id)
        Appends a new code to the text, when the code has a defined identifier.
        Parameters:
        tagType - the tag type of the code (e.g. TagType.OPENING).
        type - the type of the code (e.g. "bold").
        data - the raw code itself. (e.g. "<b>").
        id - the identifier to use for this code.
        Returns:
        the new code that was added to the text.
      • insert

        public void insert​(int offset,
                           String str)
        Inserts a String object to this fragment.
        Parameters:
        offset - position in the coded text where to insert the new String. You can use -1 to append at the end of the current content.
        str - String to insert.
        Throws:
        InvalidPositionException - when offset points inside a marker.
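        An offset "points inside a marker" when it falls between a marker character and the index character that follows it. The check below is a sketch under that reading: InsertPositionSketch is a hypothetical class working on a bare StringBuilder, and IllegalArgumentException stands in for InvalidPositionException so that the example stays self-contained.

```java
/** Hypothetical sketch of the position check performed before an insertion. */
final class InsertPositionSketch {
    static boolean isMarker(char ch) {
        return ch == 0xE101 || ch == 0xE102 || ch == 0xE103;
    }

    /** True when offset separates a marker character from its index character. */
    static boolean isInsideMarker(CharSequence codedText, int offset) {
        return offset > 0 && offset <= codedText.length()
                && isMarker(codedText.charAt(offset - 1));
    }

    static void insert(StringBuilder codedText, int offset, String str) {
        if (offset == -1) {
            codedText.append(str); // -1 means "append at the end"
            return;
        }
        if (isInsideMarker(codedText, offset)) {
            // Stand-in for InvalidPositionException.
            throw new IllegalArgumentException("Offset " + offset + " is inside a marker.");
        }
        codedText.insert(offset, str);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ab\uE103\uE110cd");
        insert(sb, 2, "X");                     // just before the marker: allowed
        System.out.println(sb.length());        // prints 7
        System.out.println(isInsideMarker(sb, 4)); // prints true
    }
}
```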
      • insert

        public void insert​(int offset,
                           Code code)
        Inserts a Code object to this fragment.
        Parameters:
        offset - position in the coded text where to insert the new Code. You can use -1 to append at the end of the current content.
        code - Code to insert.
        Throws:
        InvalidPositionException - when offset points inside a marker.
      • insert

        public void insert​(int offset,
                           TextFragment fragment)
        Inserts a TextFragment object to this fragment.
        Parameters:
        offset - position in the coded text where to insert the new fragment. You can use -1 to append at the end of the current content.
        fragment - the TextFragment to insert.
        Throws:
        InvalidPositionException - when offset points inside a marker.
      • insert

        public void insert​(int offset,
                           TextFragment fragment,
                           boolean keepCodeIds)
        Inserts a TextFragment object to this fragment.
        Parameters:
        offset - position in the coded text where to insert the new fragment. You can use -1 to append at the end of the current content.
        fragment - the TextFragment to insert.
        keepCodeIds - true to not change Ids of the codes of the inserted TextFragment.
      • clear

        public void clear()
        Clears the fragment of all content. The parent is not modified.
      • getText

        public String getText()
        Gets the text of the fragment (all codes are removed).
        Returns:
        the content of the fragment without codes.
      • getCodedText

        public String getCodedText()
        Gets the coded text representation of the fragment.
        Returns:
        the coded text for the fragment.
      • setCodedText

        public void setCodedText​(String newCodedText)
        Sets the coded text of the fragment, using its existing codes. The coded text must be valid for the existing codes.
        Parameters:
        newCodedText - the coded text to apply.
        Throws:
        InvalidContentException - when the coded text is not valid, or does not correspond to the existing codes.
      • getCodedText

        public String getCodedText​(int start,
                                   int end)
        Gets the portion of coded text for a given section of the coded text.
        Parameters:
        start - the position of the first character or marker of the section (in the coded text representation).
        end - The position just after the last character or marker of the section (in the coded text representation). You can use -1 for ending the section at the end of the fragment.
        Returns:
        the portion of coded text for the given range. It can be empty but never null.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
      • getCode

        public Code getCode​(char indexAsChar)
        Gets the code for a given index formatted as character (the second special character in a marker in a coded text string).
        Parameters:
        indexAsChar - the index value coded as character.
        Returns:
        the corresponding code.
      • getCode

        public Code getCode​(int index)
        Gets the code for a given index.
        Parameters:
        index - the index of the code.
        Returns:
        the code for the given index.
      • getCodes

        public List<Code> getCodes()
        Gets the list of all codes for the fragment.
        Returns:
        the list of all codes for the fragment. If there is no code, an empty list is returned.
      • setCodes

        protected void setCodes​(List<Code> codes)
      • getClonedCodes

        public List<Code> getClonedCodes()
        Gets a list of the copy of the codes for this fragment.
        Returns:
        the list of the copy of the codes for this fragment. If there is no code, an empty list is returned.
      • getCodes

        public List<Code> getCodes​(int start,
                                   int end)
        Gets a copy of the list of the codes that are within a given section of coded text.
        Parameters:
        start - the position of the first character or marker of the section (in the coded text representation).
        end - the position just after the last character or marker of the section (in the coded text representation).
        Returns:
        a new list of all codes within the given range.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
      • getIndex

        public int getIndex​(int id)
        Gets the index value for the first in-line code (in the codes list) with a given identifier.
        Parameters:
        id - the identifier to look for.
        Returns:
        the index of the found code, or -1 if none is found.
      • getIndexForOpening

        public int getIndexForOpening​(int id)
        Gets the index value for the opening in-line code (in the codes list) with a given identifier.
        Parameters:
        id - the identifier of the opening tag to look for.
        Returns:
        the index of the found opening code, or -1 if none is found.
      • getIndexForClosing

        public int getIndexForClosing​(int id)
        Gets the index value for the closing in-line code (in the codes list) with a given identifier.
        Parameters:
        id - the identifier of the closing tag to look for.
        Returns:
        the index of the found closing code, or -1 if none is found.
      • isEmpty

        public boolean isEmpty()
        Indicates if the fragment is empty (no text and no codes).
        Returns:
        true if the fragment is empty.
      • hasText

        public boolean hasText()
        Indicates if this fragment contains at least one character other than a whitespace (inline codes and other markers do not count as characters).
        Returns:
        true if this fragment contains at least one character, excluding whitespace.
      • hasText

        public boolean hasText​(boolean whiteSpacesAreText)
        Indicates if this fragment contains at least one character (inline codes, segment markers, and annotation markers do not count as characters).
        Parameters:
        whiteSpacesAreText - indicates if whitespaces should be considered characters or not for the purpose of checking if this fragment is empty.
        Returns:
        true if this fragment contains at least one character (that character could be a whitespace if whiteSpacesAreText is set to true).
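        The check described above could be sketched as a scan over the coded text that skips code placeholders. This is only an illustration: the marker values are assumptions, the helper works on a plain String rather than a TextFragment, and it handles inline code markers only (not segment or annotation markers).

```java
// Hypothetical sketch; not the library's implementation.
class HasTextSketch {
    // Assumed marker values, for illustration only.
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    static boolean hasText(String codedText, boolean whiteSpacesAreText) {
        for (int i = 0; i < codedText.length(); i++) {
            char ch = codedText.charAt(i);
            if (isMarker(ch)) {
                i++; // skip the index character that follows the marker
                continue;
            }
            if (whiteSpacesAreText || !Character.isWhitespace(ch)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A space followed by one isolated-code placeholder.
        String coded = " " + MARKER_ISOLATED + "\uE110";
        System.out.println(hasText(coded, false)); // false
        System.out.println(hasText(coded, true));  // true
    }
}
```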
      • hasCode

        public boolean hasCode()
        Indicates if the fragment contains at least one code.
        Returns:
        true if the fragment contains at least one code.
      • remove

        public void remove​(int start,
                           int end)
        Removes a section of the fragment (including its codes).
        Parameters:
        start - the position of the first character or marker of the section (in the coded text representation).
        end - the position just after the last character or marker of the section (in the coded text representation). You can use -1 to indicate the end of the fragment.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
      • subSequence

        public TextFragment subSequence​(int start,
                                        int end)
        Gets a copy of a sub-sequence of this object.
        Specified by:
        subSequence in interface CharSequence
        Parameters:
        start - the position of the first character or marker of the section (in the coded text representation).
        end - the position just after the last character or marker of the section (in the coded text representation). You can use -1 for ending the section at the end of the fragment.
        Returns:
        a new TextFragment object with a copy of the given sub-sequence.
      • setCodedText

        public void setCodedText​(String newCodedText,
                                 boolean allowCodeDeletion)
        Sets the coded text of the fragment, using its existing codes. The coded text must be valid for the existing codes.
        Parameters:
        newCodedText - The coded text to apply.
        allowCodeDeletion - True when missing in-line codes in the coded text means the corresponding codes should be deleted from the fragment.
        Throws:
        InvalidContentException - When the coded text is not valid, or does not correspond to the existing codes.
      • setCodedText

        public void setCodedText​(String newCodedText,
                                 List<Code> newCodes)
        Sets the coded text of the fragment and its corresponding codes.
        Parameters:
        newCodedText - the coded text to apply.
        newCodes - the list of the corresponding codes.
        Throws:
        InvalidContentException - when the coded text is not valid or does not correspond to the new codes.
      • setCodedText

        public void setCodedText​(String newCodedText,
                                 List<Code> newCodes,
                                 boolean allowCodeDeletion)
        Sets the coded text of the fragment and its corresponding codes.
        Parameters:
        newCodedText - the coded text to apply.
        newCodes - the list of the corresponding codes.
        allowCodeDeletion - True when missing in-line codes in the coded text means the corresponding codes should be deleted from the fragment.
        Throws:
        InvalidContentException - when the coded text is not valid or does not correspond to the new codes.
      • toString

        public String toString()
        Gets the coded text for this fragment. This method returns the same data as getCodedText().

        Each code is represented by a placeholder made of two special characters. To get the content with the codes expanded as their original data use toText().

        Specified by:
        toString in interface CharSequence
        Overrides:
        toString in class Object
        Returns:
        the coded text for this fragment.
      • toText

        public String toText()
        Returns the content of this fragment, including the original codes whenever possible. To get the coded text for this fragment use getCodedText() or toString().
        Returns:
        the content of this fragment.
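        To illustrate the difference between the coded text and the expanded text, here is a minimal sketch that replaces each two-character placeholder with the original data of its code. The marker values, the CHARBASE value, and the use of a List<String> for the code data are assumptions of the sketch, not the library's data model.

```java
import java.util.List;

// Hypothetical sketch; not the library's implementation.
class ToTextSketch {
    // Assumed values, for illustration only.
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    // Expands each placeholder into the data of the code it refers to.
    static String toText(String codedText, List<String> codeData) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < codedText.length(); i++) {
            char ch = codedText.charAt(i);
            if (isMarker(ch)) {
                // The character after the marker encodes the index in the code list.
                int index = codedText.charAt(++i) - CHARBASE;
                out.append(codeData.get(index));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String coded = "A" + MARKER_OPENING + (char) CHARBASE
                + "bold" + MARKER_CLOSING + (char) (CHARBASE + 1) + "B";
        System.out.println(toText(coded, List.of("<b>", "</b>"))); // A<b>bold</b>B
    }
}
```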
      • toOuterText

        public String toOuterText()
      • compareTo

        public int compareTo​(TextFragment tf)
        Compares an object with this TextFragment. If the object is also a TextFragment, the method returns the same result as compareTo(fragment, CompareMode.IGNORE_CODE).
        Note that inline codes are not compared with this method but the markers and code indices embedded in the coded text are considered.
        Specified by:
        compareTo in interface Comparable<TextFragment>
        Parameters:
        tf - the object to compare with this TextFragment.
        Returns:
        0 if the objects are equal.
      • compareTo

        public int compareTo​(TextFragment frag,
                             TextFragment.CompareMode compMode)
        Compares with another TextFragment. This first compares the text member of this and the other TextFragment and returns the result if they aren't equal.
        If the text members are equal, one of these actions is taken depending on compMode:
        • IGNORE_CODE: 0 is returned
        • CODE_DATA_ONLY: The data member of the Code in the codes array is concatenated for each TextFragment, and string comparison result is returned.
        • CODE_ALL: The codes array is processed by Codes.codesToString() for each TextFragment, and the result is returned.
        Note that CODE_DATA_ONLY considers only the data member of each Code, while CODE_ALL considers all the members of each Code.

        Caveat #1:
        The current implementation assumes that code indexes are in the normal ascending order in the coded text. For example, if
        tf1.text="ABC", tf1.codes={{tagType:OPENING,id:1,data:"<em>"}, {tagType:CLOSING,id:1,data:"</em>"}}
        and
        tf2.text="ABC", tf2.codes={{tagType:CLOSING,id:1,data:"</em>"}, {tagType:OPENING,id:1,data:"<em>"}}
        tf1.equals(tf2) returns false in all comparison modes, although they are semantically equal.

        Parameters:
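        frag - the TextFragment to compare with this one.
        compMode - the comparison mode, which determines how the codes are compared when the text members are equal.
        Returns:
        0 if the fragments are equal for the given mode; otherwise the non-zero result of the first comparison that differs.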
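        The three comparison modes listed above can be sketched as follows. SimpleCode is a hypothetical stand-in for the library's Code class, and its toString() form is only an assumed substitute for whatever string form the real CODE_ALL comparison uses.

```java
import java.util.List;

// Hypothetical sketch; not the library's implementation.
class CompareModeSketch {
    enum CompareMode { IGNORE_CODE, CODE_DATA_ONLY, CODE_ALL }

    // Stand-in for an inline code: tag type, identifier, and original data.
    static final class SimpleCode {
        final String tagType;
        final int id;
        final String data;

        SimpleCode(String tagType, int id, String data) {
            this.tagType = tagType;
            this.id = id;
            this.data = data;
        }

        @Override
        public String toString() {
            return tagType + ":" + id + ":" + data;
        }
    }

    // Concatenates only the data member of each code.
    static String dataOnly(List<SimpleCode> codes) {
        StringBuilder sb = new StringBuilder();
        for (SimpleCode c : codes) {
            sb.append(c.data);
        }
        return sb.toString();
    }

    static int compare(String text1, List<SimpleCode> codes1,
                       String text2, List<SimpleCode> codes2, CompareMode mode) {
        int res = text1.compareTo(text2);
        if (res != 0 || mode == CompareMode.IGNORE_CODE) {
            return res; // the text decides, or the codes are ignored
        }
        if (mode == CompareMode.CODE_DATA_ONLY) {
            return dataOnly(codes1).compareTo(dataOnly(codes2));
        }
        // CODE_ALL: compare a string form that includes every member.
        return codes1.toString().compareTo(codes2.toString());
    }

    public static void main(String[] args) {
        List<SimpleCode> a = List.of(new SimpleCode("OPENING", 1, "<em>"));
        List<SimpleCode> b = List.of(new SimpleCode("OPENING", 2, "<em>"));
        System.out.println(compare("ABC", a, "ABC", b, CompareMode.CODE_DATA_ONLY)); // 0
        System.out.println(compare("ABC", a, "ABC", b, CompareMode.CODE_ALL) != 0);  // true
    }
}
```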
      • equals

        public final boolean equals​(Object object)
        Overrides:
        equals in class Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • changeToCode

        public int changeToCode​(int start,
                                int end,
                                TextFragment.TagType tagType,
                                String type)
        Changes a section of the coded text into a single code. Any code already existing that is within the range will become part of the new code.
        Parameters:
        start - The position of the first character or marker of the section (in the coded text representation).
        end - the position just after the last character or marker of the section (in the coded text representation).
        tagType - the tag type of the new code.
        type - the type of the new code.
        Returns:
        the difference between the coded text length before and after the operation. This value can be used to adjust further start and end positions that have been calculated on the coded text before the changes are applied.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
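        The way the returned difference is meant to be used can be sketched like this. The helper below works on a StringBuilder and a list of code data instead of a TextFragment, assumes the converted range holds plain text only, and uses assumed marker values. The sign convention (new length minus old length) is also an assumption of the sketch; check the actual return value before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the library's implementation.
class ChangeToCodeSketch {
    // Assumed values, for illustration only.
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    // Replaces [start, end) with a two-character placeholder and stores the
    // replaced text as the data of a new code. Returns new length minus old length.
    static int changeToCode(StringBuilder codedText, List<String> codeData, int start, int end) {
        int before = codedText.length();
        codeData.add(codedText.substring(start, end));
        String placeholder = "" + MARKER_ISOLATED + (char) (CHARBASE + codeData.size() - 1);
        codedText.replace(start, end, placeholder);
        return codedText.length() - before;
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("Hello {0} and {1}");
        List<String> codes = new ArrayList<>();
        // Both ranges were computed on the original text: {0} is [6, 9), {1} is [14, 17).
        int diff = changeToCode(text, codes, 6, 9);
        // Shift the second range by the difference before applying it.
        changeToCode(text, codes, 14 + diff, 17 + diff);
        System.out.println(codes);         // [{0}, {1}]
        System.out.println(text.length()); // 15
    }
}
```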
      • changeToCode

        public int changeToCode​(int start,
                                int end,
                                TextFragment.TagType tagType,
                                String type,
                                boolean setDisplayText)
        Changes a section of the coded text into a single code. Any code already existing that is within the range will become part of the new code.
        Parameters:
        start - The position of the first character or marker of the section (in the coded text representation).
        end - the position just after the last character or marker of the section (in the coded text representation).
        tagType - the tag type of the new code.
        type - the type of the new code.
        setDisplayText - if true, sets the replaced subsequence as the display text of the new code.
        Returns:
        the difference between the coded text length before and after the operation. This value can be used to adjust further start and end positions that have been calculated on the coded text before the changes are applied.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
      • findClosingCodePosition

        public int findClosingCodePosition​(int id,
                                           int indexOfOpening)
        Finds the position in this coded text of the closing code for a given opening code.
        Parameters:
        id - identifier of the opening code.
        indexOfOpening - index of the opening code.
        Returns:
        the position in this text of the closing code for the given opening code, or -1 if it could not be found.
      • findOpeningCodePosition

        public int findOpeningCodePosition​(int id,
                                           int indexOfClosing)
        Finds the position in this coded text of the opening code for a given closing code.
        Parameters:
        id - identifier of the closing code.
        indexOfClosing - index of the closing code.
        Returns:
        the position in this text of the opening code for the given closing code, or -1 if it could not be found.
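        As an illustration of the kind of lookup findClosingCodePosition performs, the sketch below scans a coded text for the closing placeholder that shares the ID of a given opening code. It is not the library's algorithm: the marker and CHARBASE values are assumptions, and the codes are reduced to an array of IDs indexed by code index.

```java
// Hypothetical sketch; not the library's implementation.
class FindClosingSketch {
    // Assumed values, for illustration only.
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    // codeIds[i] is the ID of the code at index i in the code list.
    static int findClosingCodePosition(String codedText, int[] codeIds, int id, int indexOfOpening) {
        boolean afterOpening = false;
        for (int i = 0; i < codedText.length(); i++) {
            char ch = codedText.charAt(i);
            if (ch != MARKER_OPENING && ch != MARKER_CLOSING && ch != MARKER_ISOLATED) {
                continue; // ordinary character
            }
            int index = codedText.charAt(i + 1) - CHARBASE;
            if (ch == MARKER_OPENING && index == indexOfOpening) {
                afterOpening = true; // start looking for the closing code from here
            } else if (afterOpening && ch == MARKER_CLOSING && codeIds[index] == id) {
                return i; // position of the closing marker
            }
            i++; // skip the index character of the placeholder
        }
        return -1;
    }

    public static void main(String[] args) {
        // Opening code (index 0), "x", closing code (index 1); both codes have ID 1.
        String coded = "" + MARKER_OPENING + (char) CHARBASE + "x"
                + MARKER_CLOSING + (char) (CHARBASE + 1);
        System.out.println(findClosingCodePosition(coded, new int[] {1, 1}, 1, 0)); // 3
    }
}
```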
      • annotate

        public int annotate​(int start,
                            int end,
                            String type,
                            InlineAnnotation annotation)
        Annotates a section of this text.
        Parameters:
        start - the position of the first character or marker of the section to annotate (in the coded text representation).
        end - the position just after the last character or marker of the section to annotate (in the coded text representation).
        type - the type of annotation to set.
        annotation - the annotation to set (can be null).
        Returns:
        the difference between the coded text length before and after the operation. This value can be used to adjust further start and end positions that have been calculated on the coded text before the changes are applied.
        Throws:
        InvalidPositionException - when start or end points inside a marker.
      • removeAnnotations

        public void removeAnnotations()
        Removes all annotations in this text. This also removes any code that is or was there only for holding an annotation.
      • removeAnnotations

        public void removeAnnotations​(String type)
        Removes all annotations of a given type in this text. This also removes any code that is there only for holding an annotation of the given type, or any code that has no annotation and no data either.
        Parameters:
        type - the type of annotation to remove.
      • hasAnnotation

        public boolean hasAnnotation()
        Indicates if this text has at least one annotation.
        Returns:
        true if there is at least one annotation, false otherwise.
      • hasAnnotation

        public boolean hasAnnotation​(String type)
        Indicates if this text has at least one annotation of a given type.
        Parameters:
        type - the type of annotation to look for.
        Returns:
        true if there is at least one annotation of the given type, false otherwise.
      • cleanUnusedCodes

        public void cleanUnusedCodes()
        Removes all codes that have no data and no annotation.
      • cleanCodes

        public TextFragment cleanCodes()
        Removes all codes both in the Codes list and the markers.
        Returns:
        this TextFragment, with the codes removed
      • getCodePosition

        public int getCodePosition​(int index)
      • getAnnotatedSpans

        public List<AnnotatedSpan> getAnnotatedSpans​(String type)
        Gets the list of all spans of text annotated with a given type of annotation.
        Parameters:
        type - the type of annotation to look for.
        Returns:
        a list of annotated spans for the given type (it may be empty).
      • renumberCodes

        public int renumberCodes()
        Renumbers the IDs of the codes in the fragment.
        Returns:
        The last value used for code ID or 0 if this fragment has no codes.
      • renumberCodes

        public int renumberCodes​(int idBase)
        Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base.
        Parameters:
        idBase - The base from which code IDs start numbering.
        Returns:
        The last value used for code ID or idBase-1 if this fragment has no codes.
      • renumberCodes

        public int renumberCodes​(int idBase,
                                 boolean mindPosition)
        Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base.
        Parameters:
        idBase - The base from which code IDs start numbering.
        mindPosition - If true, the codes with lesser positions in this text fragment will have lesser IDs. If false, the codes with lesser original IDs will be assigned lesser IDs.
        Returns:
        The last value used for code ID or idBase-1 if this fragment has no codes.
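        The effect of mindPosition can be pictured with a small sketch that works on an array of code IDs listed in the order the codes appear in the text (paired opening and closing codes share an ID). The renumber helper is hypothetical and only mirrors the documented contract; it is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the library's implementation.
class RenumberSketch {
    // ids: code IDs in text order, modified in place.
    // Returns the last ID used, or idBase - 1 if there are no codes.
    static int renumber(int[] ids, int idBase, boolean mindPosition) {
        // Distinct original IDs, in order of first appearance in the text.
        List<Integer> order = new ArrayList<>();
        for (int id : ids) {
            if (!order.contains(id)) {
                order.add(id);
            }
        }
        if (!mindPosition) {
            Collections.sort(order); // lesser original IDs get lesser new IDs
        }
        Map<Integer, Integer> newIds = new HashMap<>();
        int next = idBase;
        for (int old : order) {
            newIds.put(old, next++);
        }
        for (int i = 0; i < ids.length; i++) {
            ids[i] = newIds.get(ids[i]);
        }
        return next - 1;
    }

    public static void main(String[] args) {
        int[] byPosition = {5, 5, 2};
        System.out.println(renumber(byPosition, 1, true));             // 2
        System.out.println(java.util.Arrays.toString(byPosition));     // [1, 1, 2]
        int[] byOriginalId = {5, 5, 2};
        renumber(byOriginalId, 1, false);
        System.out.println(java.util.Arrays.toString(byOriginalId));   // [2, 2, 1]
    }
}
```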
      • removeCode

        public void removeCode​(Code code)
        Removes the given Code from this fragment.
        Parameters:
        code - the Code to remove.
      • balanceMarkers

        public void balanceMarkers()
        Balances the markers based on the tag type of the codes. Closing codes can have -1 as their ID; they get the ID of their matching opening code, or a new ID if they are isolated. Closing codes with an existing ID that find themselves isolated keep the same ID. This method also resets the last code ID value to the highest code ID found. The method does nothing if the TextFragment is already balanced. To force it to run on a TextFragment that is already balanced, call invalidate() before calling this method.
      • alignCodeIds

        public void alignCodeIds​(TextFragment base,
                                 CodeMatchStrategy strategy)
        Aligns the code IDs of this fragment with the ones of a given fragment. This method re-assigns the IDs of the in-line codes of this fragment based on the code data of the provided fragment. If several codes have the same data, the first one is preferred, as it is the matching target code in the majority of cases. A typical use is when the source and target fragments have codes generated from regular expressions that are not in the same order. For example, if the source is "%d equals %s" and the target is "%s equals %d", where %s and %d are codes, you want the IDs to match for the codes with the same content.
        Parameters:
        base - the fragment to use as the base for the synchronization.
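        The "%d equals %s" example can be sketched with plain arrays. The alignIds helper below is hypothetical: it matches codes by data only, takes the first base code not yet matched, and returns -1 for a code with no counterpart, which is a convention of the sketch rather than documented behavior.

```java
// Hypothetical sketch; not the library's implementation.
class AlignIdsSketch {
    // baseData/baseIds describe the codes of the base fragment, in order.
    // data holds the code data of the fragment to align.
    // Returns the new IDs for that fragment, in the same order as data.
    static int[] alignIds(String[] baseData, int[] baseIds, String[] data) {
        int[] result = new int[data.length];
        boolean[] used = new boolean[baseData.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = -1; // sketch convention: no matching code in the base
            for (int j = 0; j < baseData.length; j++) {
                if (!used[j] && baseData[j].equals(data[i])) {
                    result[i] = baseIds[j]; // first unused code with the same data
                    used[j] = true;
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Source: "%d equals %s" with IDs 1 and 2. Target: "%s equals %d".
        int[] ids = alignIds(new String[] {"%d", "%s"}, new int[] {1, 2},
                new String[] {"%s", "%d"});
        System.out.println(java.util.Arrays.toString(ids)); // [2, 1]
    }
}
```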
      • alignCodeIds

        public void alignCodeIds​(TextFragment base)
      • append

        public TextFragment append​(char value)
        Appends a character to the fragment.
        Specified by:
        append in interface Appendable
        Parameters:
        value - the character to append.
        Returns:
        a reference to this fragment.
      • append

        public TextFragment append​(CharSequence csq)
        Appends the specified character sequence to this fragment.
        Specified by:
        append in interface Appendable
        Parameters:
        csq - the character sequence to append. If the parameter is null, the string "null" is appended.
        Returns:
        a reference to this fragment.
      • append

        public TextFragment append​(CharSequence csq,
                                   int start,
                                   int end)
        Appends a subsequence of the specified character sequence to this fragment.
        Specified by:
        append in interface Appendable
        Parameters:
        csq - the character sequence to append. If csq is null, then characters will be appended as if csq contained the string "null".
        start - the index of the first character in the subsequence.
        end - the index of the character following the last character in the subsequence.
        Returns:
        a reference to this fragment.
      • charAt

        public char charAt​(int index)
        Returns the character at the specified index in the coded text of this fragment. Each code in the coded text string takes two characters, regardless of the size of the code.

        For example: If the fragment is "A[xy]B" and "[xy]" is a code, charAt(3) returns 'B' not 'x'.

        If the specified index falls on a code placeholder, the character returned is either a marker (first character of the placeholder) or a special index to access the underlying code (second character of the placeholder). Markers can be identified using isMarker(char).

        Specified by:
        charAt in interface CharSequence
        Parameters:
        index - the index of the character to be returned.
        Returns:
        the specified character.
        Throws:
        IndexOutOfBoundsException - if the index argument is negative or not less than the length of the coded text.
        See Also:
        isMarker(char)
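        The "A[xy]B" example can be reproduced with a plain String that mimics the coded text layout. This is a sketch only: the marker and CHARBASE values are assumptions, and the local isMarker is a stand-in for the class's own isMarker(char).

```java
// Hypothetical sketch; not the library's implementation.
class CodedTextSketch {
    // Assumed values, for illustration only.
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    // Coded text for "A[xy]B", where "[xy]" is the isolated code at index 0.
    static String sample() {
        return "A" + MARKER_ISOLATED + (char) CHARBASE + "B";
    }

    public static void main(String[] args) {
        String coded = sample();
        System.out.println(coded.length());             // 4: the code takes two characters
        System.out.println(coded.charAt(3));            // B
        System.out.println(isMarker(coded.charAt(1)));  // true: first character of the placeholder
        System.out.println(coded.charAt(2) - CHARBASE); // 0: index of the code in the code list
    }
}
```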
      • length

        public int length()
        Returns the number of characters in the coded text of this fragment.

        This is not the length of the content with all its codes. In the coded text, each code is represented by a placeholder made of two characters regardless of the size of the code. For example: If the fragment is "A[xy]B" and "[xy]" is a code, length() returns 4, not 6.

        To get the length of the content including codes use toText().length(). Note that codes with references are not expanded by toText().

        Specified by:
        length in interface CharSequence
        Returns:
        the number of characters in the coded text of this fragment.
      • invalidate

        public void invalidate()
        Sets the fragment in a state where it has to be re-balanced before being used for output. This method is not harmful, but should preferably be used only when adding unbalanced paired codes.
      • getLastCodeId

        public int getLastCodeId()
        Gets the last value used for code id.
        Returns:
        the last value used for code id.
      • getLastCode

        public Code getLastCode()
        Returns the last code appended to this fragment, or null if there are no codes.
        Returns:
        code, or null
      • getCode

        public Code getCode​(Code fc)
        Finds the first code with a given ID and tagType in this fragment, or null if there is no such code.
        Parameters:
        fc - the Code to look for.
        Returns:
        code, or null
      • minimumIdValue

        public int minimumIdValue()
        Returns the smallest code ID value in this fragment.
        Returns:
        the id with the smallest value or 0 if there are no codes