Package net.sf.okapi.filters.xmlstream
Class XmlStreamFilter
- java.lang.Object
-
- net.sf.okapi.common.filters.AbstractFilter
-
- net.sf.okapi.filters.abstractmarkup.AbstractMarkupFilter
-
- net.sf.okapi.filters.xmlstream.XmlStreamFilter
-
- All Implemented Interfaces:
AutoCloseable, Iterator<Event>, IFilter
public class XmlStreamFilter extends AbstractMarkupFilter
Field Summary
-
Fields inherited from interface net.sf.okapi.common.filters.IFilter
SUB_FILTER
-
-
Constructor Summary
Constructor
XmlStreamFilter()
-
Method Summary
protected TaggedFilterConfiguration getConfig()
Get the current TaggedFilterConfiguration.
Parameters getParameters()
Gets the current parameters for this filter.
protected void handleText(CharSequence text)
An XML processor must behave as if it normalized all line breaks by translating \r\n, and any \r not followed by \n, to a single \n character: https://www.w3.org/TR/xml/#sec-line-ends
protected String normalizeAttributeName(String attrName, String attrValue, net.htmlparser.jericho.Tag tag)
Some attribute names are converted to Okapi standards, such as HTML charset to "encoding" and lang to "language".
protected void setNewlineType(String newlineType)
All new line types are normalized to \n in the XML processor.
void setParameters(IParameters params)
Sets new parameters for this filter.
void setParametersFromURL(URL config)
Initialize filter parameters from a URL.
protected void startFilter()
Initialize rule state and parser.
-
Methods inherited from class net.sf.okapi.filters.abstractmarkup.AbstractMarkupFilter
addCodeToCurrentTextUnit, addCodeToCurrentTextUnit, addToDocumentPart, addToTextUnit, addToTextUnit, addToTextUnit, addToTextUnit, canStartNewTextUnit, close, createEventBuilder, createPropertyTextUnitPlaceholder, createPropertyTextUnitPlaceholders, detectEncoding, determineTagType, disambiguateElementRuleTypes, endDocumentPart, endFilter, endGroup, endTextUnit, getCurrentDocName, getEventBuilder, getMainAttributeRule, getMainElementRule, getParsedHeader, getRuleState, getRuleTypeFromStartTag, getTextUnitId, handleCdataSection, handleCharacterEntity, handleComment, handleDocTypeDeclaration, handleDocumentPart, handleEndTag, handleNumericEntity, handleProcessingInstruction, handleServerCommon, handleServerCommonEscaped, handleStartTag, handleXmlDeclaration, hasNext, isBOM, isDocumentEncoding, isInline, isInsideTextRun, isMatchedTag, isPreserveWhitespace, isUtf8Bom, isUtf8Encoding, isWhiteSpace, next, open, open, peekTempEvent, postProcessTextUnit, preProcess, setCurrentDocName, setDocumentPartId, setMimeType, setPreserveWhitespace, setTextUnitMimeType, setTextUnitName, setTextUnitType, startDocumentPart, startGroup, startGroup, startTextUnit, startTextUnit, startTextUnit, startTextUnit, updateEndTagRuleState, updateStartTagRuleState
-
Methods inherited from class net.sf.okapi.common.filters.AbstractFilter
addConfiguration, addConfiguration, addConfiguration, addConfigurations, cancel, createEndFilterEvent, createFilterWriter, createSkeletonWriter, createStartFilterEvent, findConfiguration, getConfiguration, getConfigurations, getDisplayName, getDocumentId, getDocumentName, getEncoderManager, getEncoding, getFilterConfigurationMapper, getMimeType, getName, getNewlineType, getParameters, getParametersClassName, getParentId, getSrcLoc, getTrgLoc, isCanceled, isGenerateSkeleton, isMultilingual, removeConfiguration, setDisplayName, setDocumentName, setEncoding, setFilterConfigurationMapper, setGenerateSkeleton, setMultilingual, setName, setOptions, setParentId, setSrcLoc, setTrgLoc
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.util.Iterator
forEachRemaining, remove
-
Method Detail
-
startFilter
protected void startFilter()
Initialize rule state and parser. Called before processing of each input.
Overrides:
startFilter in class AbstractMarkupFilter
-
setNewlineType
protected void setNewlineType(String newlineType)
All new line types are normalized to \n in the XML processor.
Overrides:
setNewlineType in class AbstractFilter
Parameters:
newlineType - one of '\n', '\r' or '\r\n'.
-
handleText
protected void handleText(CharSequence text)
An XML processor must behave as if it normalized all line breaks by translating \r\n, and any \r not followed by \n, to a single \n character: https://www.w3.org/TR/xml/#sec-line-ends
Overrides:
handleText in class AbstractMarkupFilter
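The line-end rule quoted above can be illustrated with a minimal, standalone sketch. This is not the filter's actual implementation; the class name LineEndNormalizer and its normalize method are hypothetical and exist only to show the translation the XML specification describes.

```java
// Hypothetical helper, not part of Okapi: illustrates XML line-end normalization.
final class LineEndNormalizer {

    // Translate "\r\n" and any "\r" not followed by "\n" into a single "\n".
    static String normalize(CharSequence text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the "\n" of a "\r\n" pair so it is not emitted twice.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mixed Windows, old Mac, and Unix line ends in one string.
        String normalized = normalize("one\r\ntwo\rthree\nfour");
        System.out.println(normalized.equals("one\ntwo\nthree\nfour"));
    }
}
```

A single pass with one character of lookahead keeps the sketch independent of regular expressions and makes the handling of the "\r\n" pair explicit.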
-
normalizeAttributeName
protected String normalizeAttributeName(String attrName, String attrValue, net.htmlparser.jericho.Tag tag)
Description copied from class: AbstractMarkupFilter
Some attribute names are converted to Okapi standards, such as HTML charset to "encoding" and lang to "language".
Specified by:
normalizeAttributeName in class AbstractMarkupFilter
Parameters:
attrName - the attribute name
attrValue - the attribute value
tag - the Jericho Tag that contains the attribute
Returns:
the attribute name after it has passed through the normalization rules
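As a rough illustration of the name mapping described above, the sketch below covers only the two conversions the description mentions (charset to "encoding" and lang to "language"). The class AttributeNameSketch is hypothetical; the real method also receives the attribute value and the Jericho Tag and may apply further rules that are not shown here.

```java
import java.util.Map;

// Hypothetical illustration, not the Okapi implementation.
final class AttributeNameSketch {

    // Only the two mappings named in the description; any others are unknown here.
    private static final Map<String, String> OKAPI_NAMES =
            Map.of("charset", "encoding", "lang", "language");

    // Return the Okapi-style name, or the original name when no mapping applies.
    static String normalize(String attrName) {
        return OKAPI_NAMES.getOrDefault(attrName, attrName);
    }

    public static void main(String[] args) {
        System.out.println(normalize("charset"));
        System.out.println(normalize("lang"));
        System.out.println(normalize("href"));
    }
}
```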
-
getConfig
protected TaggedFilterConfiguration getConfig()
Description copied from class: AbstractMarkupFilter
Get the current TaggedFilterConfiguration. A TaggedFilterConfiguration is the result of reading in a YAML configuration file and converting it into Java objects.
Specified by:
getConfig in class AbstractMarkupFilter
Returns:
a TaggedFilterConfiguration
-
setParameters
public void setParameters(IParameters params)
Description copied from interface: IFilter
Sets new parameters for this filter.
Specified by:
setParameters in interface IFilter
Overrides:
setParameters in class AbstractFilter
Parameters:
params - The new parameters to use.
-
getParameters
public Parameters getParameters()
Description copied from interface: IFilter
Gets the current parameters for this filter.
Specified by:
getParameters in interface IFilter
Overrides:
getParameters in class AbstractFilter
Returns:
The current parameters for this filter, or DefaultParameters if this filter has no parameters.
-
setParametersFromURL
public void setParametersFromURL(URL config)
Initialize filter parameters from a URL.
Parameters:
config -