Class RegexPlainTextFilter

  • All Implemented Interfaces:
    AutoCloseable, Iterator<Event>, IFilter

    public class RegexPlainTextFilter
    extends AbstractFilter
    PlainTextFilter extracts lines of input text, separated by line terminators. The filter is aware of the following line terminators:
    • Carriage return character followed immediately by a newline character ("\r\n")
    • Newline (line feed) character ("\n")
    • Stand-alone carriage return character ("\r")
    • Next line character ("\u0085")
    • Line separator character ("\u2028")
    • Paragraph separator character ("\u2029").
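
    The terminator list above matches the line terminators that java.util.regex recognizes in Pattern.MULTILINE mode. The following is a minimal sketch of that behavior using only the JDK, not the filter itself; the class LineTerminatorDemo and the method splitLines are hypothetical names chosen for illustration. It applies the rule "^(.*?)$" (documented as the default for setRule) to a string containing each terminator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it does not use or reproduce the filter's internals.
public class LineTerminatorDemo {

    // Collects capturing group 1 of every match of "^(.*?)$" in MULTILINE mode.
    public static List<String> splitLines(String text) {
        Pattern pattern = Pattern.compile("^(.*?)$", Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(text);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            lines.add(matcher.group(1));
        }
        return lines;
    }

    public static void main(String[] args) {
        // One sample line per terminator: CRLF, LF, CR, NEL, LS, PS.
        String text = "one\r\ntwo\nthree\rfour\u0085five\u2028six\u2029seven";
        // prints [one, two, three, four, five, six, seven]
        System.out.println(splitLines(text));
    }
}
```

    Note that "\r\n" is treated as a single terminator, so no empty line is produced between "one" and "two".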

    Version:
    0.1, 09.06.2009
    • Constructor Detail

      • RegexPlainTextFilter

        public RegexPlainTextFilter()
    • Method Detail

      • setRule

        public void setRule(String rule,
                            int sourceGroup,
                            int regexOptions)
        Configures the internal line extractor. To use a custom extraction rule, call this method with the modified rule.

        Parameters:
        rule - Java regex rule used to extract lines of text. Default: "^(.*?)$".
        sourceGroup - regex capturing group denoting text to be extracted. Default: 1.
        regexOptions - Java regex options. Default: Pattern.MULTILINE.
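
        To illustrate how the three parameters relate to each other, here is a rough sketch written against java.util.regex only. It is not the filter's implementation: the class RuleDemo and the helper extract are hypothetical, and they simply mirror the parameter names of setRule. The custom rule shown (a key = value pattern with the value in group 2) is likewise an invented example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the setRule parameters; not part of the library.
public class RuleDemo {

    // Compiles 'rule' with 'regexOptions' and returns the text of capturing
    // group 'sourceGroup' for each match found in 'text'.
    public static List<String> extract(String text, String rule,
                                       int sourceGroup, int regexOptions) {
        Matcher matcher = Pattern.compile(rule, regexOptions).matcher(text);
        List<String> extracted = new ArrayList<>();
        while (matcher.find()) {
            extracted.add(matcher.group(sourceGroup));
        }
        return extracted;
    }

    public static void main(String[] args) {
        String text = "name = Alice\ncity = Paris";

        // Documented defaults: whole lines, group 1, MULTILINE.
        // prints [name = Alice, city = Paris]
        System.out.println(extract(text, "^(.*?)$", 1, Pattern.MULTILINE));

        // Invented custom rule: keep only the value after "=", which is group 2.
        // prints [Alice, Paris]
        System.out.println(
            extract(text, "^\\s*(\\w+)\\s*=\\s*(.*?)$", 2, Pattern.MULTILINE));
    }
}
```

        With the actual filter, the equivalent call would presumably pass the same three values to setRule before opening the document.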
      • getRegexParameters

        public Parameters getRegexParameters()
        Provides access to the internal line extractor's Parameters object.
        Returns:
        The Parameters object, through which you can access the line extraction rule, source group, regex options, etc.
      • close

        public void close()
        Description copied from interface: IFilter
        Closes the input document. Developers should call this method from within their code before sending the last event; this allows writer objects to overwrite the input file when they receive the last event. This method must also be safe to call even if the input document is not open.
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface IFilter
        Overrides:
        close in class AbstractFilter
      • getName

        public String getName()
        Description copied from interface: IFilter
        Gets the name/identifier of this filter.
        Specified by:
        getName in interface IFilter
        Overrides:
        getName in class AbstractFilter
        Returns:
        The name/identifier of the filter.
      • hasNext

        public boolean hasNext()
        Description copied from interface: IFilter
        Indicates if there is an event to process.

        Implementer Note: The caller must be able to call this method several times without changing state.

        Returns:
        True if there is at least one event to process, false if not.
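
        One common way to satisfy the implementer note is to buffer the upcoming item so that repeated hasNext() calls do no further work. The sketch below shows that pattern on plain strings. It is an illustration under stated assumptions, not the filter's code: the class LookaheadLines is hypothetical, String stands in for Event, and next() throws NoSuchElementException (the java.util.Iterator convention) rather than returning null.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical iterator over lines; String stands in for the filter's Event type.
public class LookaheadLines implements Iterator<String> {
    private final Matcher matcher;
    private String buffered;       // item fetched ahead of time, if any
    private boolean hasBuffered;   // true while 'buffered' has not been consumed
    private boolean exhausted;     // true once the matcher has run out of matches

    public LookaheadLines(String text) {
        this.matcher = Pattern.compile("^(.*?)$", Pattern.MULTILINE).matcher(text);
    }

    // Safe to call repeatedly: it advances the matcher at most once per item.
    @Override
    public boolean hasNext() {
        if (!hasBuffered && !exhausted) {
            if (matcher.find()) {
                buffered = matcher.group(1);
                hasBuffered = true;
            } else {
                exhausted = true;
            }
        }
        return hasBuffered;
    }

    // Hands out the buffered item exactly once.
    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String line = buffered;
        buffered = null;
        hasBuffered = false;
        return line;
    }

    public static void main(String[] args) {
        LookaheadLines lines = new LookaheadLines("first\nsecond");
        lines.hasNext();
        lines.hasNext(); // repeated calls do not skip "first"
        while (lines.hasNext()) {
            System.out.println(lines.next());
        }
    }
}
```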
      • next

        public Event next()
        Description copied from interface: IFilter
        Gets the next event available. This method can be called only once for each event.
        Returns:
        The next event available or null if there are no events.
      • open

        public void open(RawDocument input)
        Description copied from interface: IFilter
        Opens the input document described in a given RawDocument object. Skeleton information is always created when you use this method.
        Parameters:
        input - The RawDocument object to use to open the document.
      • open

        public void open(RawDocument input,
                         boolean generateSkeleton)
        Description copied from interface: IFilter
        Opens the input document described in a given RawDocument object, and optionally creates skeleton information.
        Specified by:
        open in interface IFilter
        Overrides:
        open in class AbstractFilter
        Parameters:
        input - The RawDocument object to use to open the document.
        generateSkeleton - true to generate the skeleton data, false otherwise.