Class FormParseState
- java.lang.Object
-
- org.apache.manifoldcf.connectorcommon.fuzzyml.CharacterReceiver
-
- org.apache.manifoldcf.connectorcommon.fuzzyml.SingleCharacterReceiver
-
- org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
-
- org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.MetaParseState
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.FormParseState
-
public class FormParseState extends LinkParseState
This class interprets the tag stream generated by the BasicParseState class and keeps track of form tags.
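The state constants and fields listed below suggest a small state machine over form, select, option, and textarea tags. The following is a minimal stand-alone sketch of that idea, not the ManifoldCF implementation: the class name FormStateSketch, its methods, the numeric state values, the event strings, and the assumption that tag names arrive lowercased are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of form-tag tracking; not the real FormParseState. */
class FormStateSketch {
  // Arbitrary values for illustration; the real constants are documented
  // under "Constant Field Values" and may differ.
  static final int NORMAL = 0;
  static final int IN_FORM = 1;
  static final int IN_SELECT = 2;
  static final int IN_TEXTAREA = 3;
  static final int IN_OPTION = 4;

  int state = NORMAL;
  String selectName;              // name attribute of the enclosing <select>
  String optionValue;             // value attribute of the current <option>, if any
  StringBuilder optionValueText;  // body text of the current <option>
  final List<String> events = new ArrayList<>();

  /** Handles a start tag; tagName is assumed to be lowercased already. */
  void startTag(String tagName, Map<String, String> attrs) {
    switch (state) {
      case NORMAL:
        if (tagName.equals("form")) { state = IN_FORM; events.add("form-start"); }
        break;
      case IN_FORM:
        if (tagName.equals("select")) {
          selectName = attrs.get("name");
          state = IN_SELECT;
        } else if (tagName.equals("textarea")) {
          events.add("input " + attrs.get("name"));
          state = IN_TEXTAREA;
        } else if (tagName.equals("input")) {
          events.add("input " + attrs.get("name"));
        }
        break;
      case IN_SELECT:
        if (tagName.equals("option")) {
          optionValue = attrs.get("value");
          optionValueText = new StringBuilder();
          state = IN_OPTION;
        }
        break;
      default:
        break;
    }
  }

  /** Handles an end tag and steps back out one nesting level. */
  void endTag(String tagName) {
    switch (state) {
      case IN_OPTION:
        if (tagName.equals("option")) {
          // Fall back to the option's body text when no value attribute was given.
          String value = optionValue != null ? optionValue : optionValueText.toString().trim();
          events.add("option " + selectName + "=" + value);
          state = IN_SELECT;
        }
        break;
      case IN_SELECT:
        if (tagName.equals("select")) state = IN_FORM;
        break;
      case IN_TEXTAREA:
        if (tagName.equals("textarea")) state = IN_FORM;
        break;
      case IN_FORM:
        if (tagName.equals("form")) { state = NORMAL; events.add("form-end"); }
        break;
      default:
        break;
    }
  }

  /** Collects body characters only while inside an option. */
  void character(char c) {
    if (state == IN_OPTION) optionValueText.append(c);
  }

  public static void main(String[] args) {
    FormStateSketch s = new FormStateSketch();
    s.startTag("form", Collections.singletonMap("action", "/search"));
    s.startTag("select", Collections.singletonMap("name", "lang"));
    s.startTag("option", Collections.<String, String>emptyMap());
    for (char c : "English".toCharArray()) s.character(c);
    s.endTag("option");
    s.endTag("select");
    s.endTag("form");
    System.out.println(s.events);
  }
}
```

In this sketch an option without a value attribute falls back to its body text, which is why a text buffer is kept alongside the attribute value. Whether the real class applies the same fallback is an assumption.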
-
-
Field Summary
Fields
protected int formParseState
protected static int FORMPARSESTATE_IN_FORM
protected static int FORMPARSESTATE_IN_OPTION
protected static int FORMPARSESTATE_IN_SELECT
protected static int FORMPARSESTATE_IN_TEXTAREA
protected static int FORMPARSESTATE_NORMAL
protected java.lang.String optionSelected
protected java.lang.String optionValue
protected java.lang.StringBuilder optionValueText
protected java.lang.String selectMultiple
protected java.lang.String selectName
-
Fields inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
handler
-
Fields inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
scriptParseState, SCRIPTPARSESTATE_INSCRIPT, SCRIPTPARSESTATE_NORMAL
-
Fields inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
accumBuffer, ampBuffer, bTagDepth, currentAttrList, currentAttrName, currentAttrNameBuffer, currentState, currentTagName, currentTagNameBuffer, currentValueBuffer, inAmpersand, mapLookup, TAGPARSESTATE_IN_ATTR_LOOKING_FOR_VALUE, TAGPARSESTATE_IN_ATTR_NAME, TAGPARSESTATE_IN_ATTR_VALUE, TAGPARSESTATE_IN_BANG_TOKEN, TAGPARSESTATE_IN_BRACKET_TOKEN, TAGPARSESTATE_IN_CDATA_BODY, TAGPARSESTATE_IN_COMMENT, TAGPARSESTATE_IN_DOUBLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_END_TAG_NAME, TAGPARSESTATE_IN_QTAG_ATTR_LOOKING_FOR_VALUE, TAGPARSESTATE_IN_QTAG_ATTR_NAME, TAGPARSESTATE_IN_QTAG_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_DOUBLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_NAME, TAGPARSESTATE_IN_QTAG_SAW_QUESTION, TAGPARSESTATE_IN_QTAG_SINGLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_UNQUOTED_ATTR_VALUE, TAGPARSESTATE_IN_SINGLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_TAG_NAME, TAGPARSESTATE_IN_TAG_SAW_SLASH, TAGPARSESTATE_IN_UNQUOTED_ATTR_VALUE, TAGPARSESTATE_IN_UNQUOTED_ATTR_VALUE_SAW_SLASH, TAGPARSESTATE_NEED_FINAL_BRACKET, TAGPARSESTATE_NORMAL, TAGPARSESTATE_SAWCOMMENTDASH, TAGPARSESTATE_SAWDASH, TAGPARSESTATE_SAWEXCLAMATION, TAGPARSESTATE_SAWLEFTANGLE, TAGPARSESTATE_SAWRIGHTBRACKET, TAGPARSESTATE_SAWSECONDCOMMENTDASH, TAGPARSESTATE_SAWSECONDRIGHTBRACKET
-
-
Constructor Summary
Constructors
FormParseState(IHTMLHandler handler)
-
Method Summary
protected boolean noteNonscriptEndTag(java.lang.String tagName)
protected boolean noteNonscriptTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes)
protected boolean noteNormalCharacter(char thisChar)
-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.LinkParseState
finishUp
-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
acceptNewTag, noteTag, noteTagEnd
-
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
noteBTag, noteBTagToken, noteEndBTag, noteEndEscaped, noteEndTag, noteEscaped, noteEscapedCharacter, noteQTag, noteTag
-
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
attributeDecode, dealWithCharacter, dumpValues, isPunctuation, isWhitespace, mapChunk, newBuffer, outputAmpBuffer
-
-
-
-
Field Detail
-
FORMPARSESTATE_NORMAL
protected static final int FORMPARSESTATE_NORMAL
- See Also:
- Constant Field Values
-
FORMPARSESTATE_IN_FORM
protected static final int FORMPARSESTATE_IN_FORM
- See Also:
- Constant Field Values
-
FORMPARSESTATE_IN_SELECT
protected static final int FORMPARSESTATE_IN_SELECT
- See Also:
- Constant Field Values
-
FORMPARSESTATE_IN_TEXTAREA
protected static final int FORMPARSESTATE_IN_TEXTAREA
- See Also:
- Constant Field Values
-
FORMPARSESTATE_IN_OPTION
protected static final int FORMPARSESTATE_IN_OPTION
- See Also:
- Constant Field Values
-
formParseState
protected int formParseState
-
selectName
protected java.lang.String selectName
-
selectMultiple
protected java.lang.String selectMultiple
-
optionValue
protected java.lang.String optionValue
-
optionSelected
protected java.lang.String optionSelected
-
optionValueText
protected java.lang.StringBuilder optionValueText
-
-
Constructor Detail
-
FormParseState
public FormParseState(IHTMLHandler handler)
-
-
Method Detail
-
noteNonscriptTag
protected boolean noteNonscriptTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Overrides:
- noteNonscriptTag in class LinkParseState
- Throws:
- org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteNonscriptEndTag
protected boolean noteNonscriptEndTag(java.lang.String tagName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Overrides:
- noteNonscriptEndTag in class ScriptParseState
- Throws:
- org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteNormalCharacter
protected boolean noteNormalCharacter(char thisChar) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Overrides:
- noteNormalCharacter in class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
- Throws:
- org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
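Each method above overrides a hook declared higher in the class hierarchy. One common way to layer such hooks is for every subclass to give its superclass the first look at a tag and then handle its own tags. The sketch below illustrates only that pattern. The class names ParseLayerSketch, LinkLayerSketch, and FormLayerSketch are hypothetical, and the meaning given to the boolean result (true asks the caller to stop parsing) is an assumption, since this page does not document the return value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical base layer; stands in for the inherited parse states. */
class ParseLayerSketch {
  protected final List<String> seen = new ArrayList<>();

  /** Assumed contract for this sketch: return true to ask the caller to stop. */
  protected boolean noteNonscriptTag(String tagName, Map<String, String> attributes) {
    return false;
  }
}

/** Hypothetical link layer: records anchors, defers everything else upward. */
class LinkLayerSketch extends ParseLayerSketch {
  @Override
  protected boolean noteNonscriptTag(String tagName, Map<String, String> attributes) {
    if (super.noteNonscriptTag(tagName, attributes)) return true;
    if (tagName.equals("a")) seen.add("link:" + attributes.get("href"));
    return false;
  }
}

/** Hypothetical form layer: lets the link layer run first, then records forms. */
class FormLayerSketch extends LinkLayerSketch {
  @Override
  protected boolean noteNonscriptTag(String tagName, Map<String, String> attributes) {
    if (super.noteNonscriptTag(tagName, attributes)) return true;
    if (tagName.equals("form")) seen.add("form:" + attributes.get("action"));
    return false;
  }
}

class LayeredHooksDemo {
  public static void main(String[] args) {
    FormLayerSketch layer = new FormLayerSketch();
    layer.noteNonscriptTag("a", Collections.singletonMap("href", "/about"));
    layer.noteNonscriptTag("form", Collections.singletonMap("action", "/login"));
    System.out.println(layer.seen);
  }
}
```

Because each layer calls its superclass first, a subclass in this pattern adds form handling without disturbing the link handling it inherits.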
-