Class WikiConnector
- java.lang.Object
- org.apache.manifoldcf.core.connector.BaseConnector
- org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- org.apache.manifoldcf.crawler.connectors.wiki.WikiConnector
- All Implemented Interfaces:
org.apache.manifoldcf.core.interfaces.IConnector, org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
public class WikiConnector extends org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
This is the repository connector for a wiki.
-
-
Nested Class Summary
| Modifier and Type | Class | Description |
| --- | --- | --- |
| protected static class | WikiConnector.APILoginResult | |
| protected class | WikiConnector.ExecuteAPILoginThread | Thread to execute a "login" operation. |
| protected static class | WikiConnector.ExecuteCheckThread | Thread to execute a "check" operation. |
| protected static class | WikiConnector.ExecuteGetDocInfoThread | Thread to execute a "get doc info" operation. |
| protected static class | WikiConnector.ExecuteGetDocURLsThread | Thread to execute a "get doc URLs" operation. |
| protected static class | WikiConnector.ExecuteGetNamespacesThread | Thread to execute a "get namespaces" operation. |
| protected static class | WikiConnector.ExecuteGetTimestampThread | Thread to execute a "get timestamp" operation. |
| protected static class | WikiConnector.ExecuteListPagesThread | Thread to execute a "list pages" operation. |
| protected class | WikiConnector.ExecuteTokenAPILoginThread | Thread to finish a "login" operation. |
| protected static class | WikiConnector.ReturnString | |
| protected static class | WikiConnector.WikiCheckAllPagesContext | Class recognizing the "api/query/allpages" context of a "check" response |
| protected static class | WikiConnector.WikiCheckAPIContext | Class representing the "api" context of a "check" response |
| protected static class | WikiConnector.WikiCheckPContext | Class representing the "api/query/allpages/p" context of a "check" response |
| protected static class | WikiConnector.WikiCheckQueryContext | Class representing the "api/query" context of a "check" response |
| protected static class | WikiConnector.WikiGetDocInfoAPIContext | Class representing the "api" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocInfoPageContext | Class representing the "api/query/pages/page" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocInfoPagesContext | Class representing the "api/query/pages" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocInfoQueryContext | Class representing the "api/query" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocInfoRevContext | Class looking for the "api/query/pages/page/revisions/rev" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocInfoRevisionsContext | Class representing the "api/query/pages/page/revisions" context of a "get doc info" response |
| protected static class | WikiConnector.WikiGetDocURLsAPIContext | Class representing the "api" context of a "get doc URLs" response |
| protected static class | WikiConnector.WikiGetDocURLsPageContext | Class looking for the "api/query/pages/page" context of a "get doc URLs" response |
| protected static class | WikiConnector.WikiGetDocURLsPagesContext | Class looking for the "api/query/pages" context of a "get doc URLs" response |
| protected static class | WikiConnector.WikiGetDocURLsQueryContext | Class representing the "api/query" context of a "get doc URLs" response |
| protected static class | WikiConnector.WikiGetNamespacesAPIContext | Class representing the "api" context of a "get namespaces" response |
| protected static class | WikiConnector.WikiGetNamespacesNamespacesContext | Class representing the "api/query/namespaces" context of a "get namespaces" response |
| protected static class | WikiConnector.WikiGetNamespacesNsContext | Class representing the "api/query/namespaces/ns" context of a "get namespaces" response |
| protected static class | WikiConnector.WikiGetNamespacesQueryContext | Class representing the "api/query" context of a "get namespaces" response |
| protected static class | WikiConnector.WikiGetTimestampAPIContext | Class representing the "api" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiGetTimestampPageContext | Class looking for the "api/query/pages/page" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiGetTimestampPagesContext | Class looking for the "api/query/pages" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiGetTimestampQueryContext | Class representing the "api/query" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiGetTimestampRevContext | Class looking for the "api/query/pages/page/revisions/rev" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiGetTimestampRevisionsContext | Class looking for the "api/query/pages/page/revisions" context of a "get timestamp" response |
| protected static class | WikiConnector.WikiListPagesAllPagesContext | Class recognizing the "api/query/allpages" context of a "list all pages" response |
| protected static class | WikiConnector.WikiListPagesAPIContext | Class representing the "api" context of a "list all pages" response |
| protected static class | WikiConnector.WikiListPagesPContext | Class representing the "api/query/allpages/p" context of a "list all pages" response |
| protected static class | WikiConnector.WikiListPagesQueryContext | Class representing the "api/query" context of a "list all pages" response |
| protected class | WikiConnector.WikiLoginAPIContext | Class representing the "api" context of a "login" response |
| protected class | WikiConnector.WikiLoginAPIResultAPIContext | Class representing the "api/result" context of a "login" response |
| protected class | WikiConnector.WikiTokenLoginAPIContext | Class representing the "api" context of a "login" response |
| protected class | WikiConnector.WikiTokenLoginAPIResultAPIContext | Class representing the "api/result" context of a "login" response |
-
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| static java.lang.String | _rcsid | |
| protected java.lang.String | accessPassword | |
| protected java.lang.String | accessRealm | |
| protected java.lang.String | accessUser | |
| protected static java.lang.String[] | activitiesList | Activities list |
| protected static java.lang.String | ACTIVITY_FETCH | Fetch activity |
| protected java.lang.String | baseURL | Base URL |
| protected org.apache.http.conn.HttpClientConnectionManager | connectionManager | Connection management |
| protected boolean | hasBeenSetup | Has setup been called? |
| protected org.apache.http.client.HttpClient | httpClient | |
| protected java.lang.String | proxyDomain | |
| protected java.lang.String | proxyHost | |
| protected java.lang.String | proxyPassword | |
| protected java.lang.String | proxyPort | |
| protected java.lang.String | proxyUsername | |
| protected java.lang.String | server | Server name |
| protected java.lang.String | serverDomain | |
| protected java.lang.String | serverLogin | |
| protected java.lang.String | serverPass | |
| protected java.lang.String | userAgent | The user-agent for this connector instance |
-
Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector
currentContext, params
-
Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
GLOBAL_DENY_TOKEN, JOBMODE_CONTINUOUS, JOBMODE_ONCEONLY, MODEL_ADD, MODEL_ADD_CHANGE, MODEL_ADD_CHANGE_DELETE, MODEL_ALL, MODEL_CHAINED_ADD, MODEL_CHAINED_ADD_CHANGE, MODEL_CHAINED_ADD_CHANGE_DELETE, MODEL_PARTIAL
-
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| WikiConnector() | Constructor. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| java.lang.String | addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode) | Queue "seed" documents. |
| java.lang.String | check() | Check status of connection. |
| void | connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters) | Connect. |
| void | disconnect() | Close the connection. |
| protected java.lang.String | executeListPagesViaThread(java.lang.String startPageTitle, java.lang.String namespace, java.lang.String prefix, org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities) | Execute a listPages() operation via a thread. |
| protected static java.lang.String[] | getAcls(org.apache.manifoldcf.core.interfaces.Specification spec) | Grab forced acl out of document specification. |
| java.lang.String[] | getActivitiesList() | List the activities we might report on. |
| java.lang.String[] | getBinNames(java.lang.String documentIdentifier) | For any given document, list the bins that it is a member of. |
| protected java.lang.String | getCheckURL() | Get a URL for a check operation. |
| protected void | getDocInfo(java.lang.String documentIdentifier, java.lang.String documentVersion, java.lang.String fullURL, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, java.lang.String[] allowACL) | Get document info and index the document. |
| protected void | getDocURLs(java.lang.String[] documentIdentifiers, java.util.Map<java.lang.String,java.lang.String> urls) | |
| protected java.lang.String | getGetDocInfoURL(java.lang.String documentIdentifier) | Create a URL to obtain a page's metadata and content, given the page ID. |
| protected java.lang.String | getGetDocURLsURL(java.lang.String[] documentIdentifiers) | Create a URL to obtain multiple pages' URLs, given the page IDs. |
| protected java.lang.String | getGetNamespacesURL() | Create a URL to obtain the namespaces. |
| protected java.lang.String | getGetTimestampURL(java.lang.String[] documentIdentifiers) | Create a URL to obtain multiple pages' timestamps, given the page IDs. |
| protected org.apache.http.client.methods.HttpRequestBase | getInitializedGetMethod(java.lang.String URL) | Create and initialize an HttpRequestBase. |
| protected org.apache.http.client.methods.HttpRequestBase | getInitializedPostMethod(java.lang.String URL, java.util.Map<java.lang.String,java.lang.String> params) | Create and initialize a post method. |
| protected java.lang.String | getListPagesURL(java.lang.String startingTitle, java.lang.String namespace, java.lang.String prefix) | Create a URL to obtain the next 500 pages. |
| int | getMaxDocumentRequest() | Get the maximum number of documents to amalgamate together into one batch, for this connector. |
| protected void | getNamespaces(java.util.Map<java.lang.String,java.lang.String> namespaces) | Obtain the set of namespaces, as a map keyed by the canonical namespace name where the value is the descriptive name. |
| protected void | getSession() | |
| protected void | getTimestamps(java.lang.String[] documentIdentifiers, java.util.Map<java.lang.String,java.lang.String> versions, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) | Obtain document versions for a set of documents. |
| protected static void | handleException(java.lang.Throwable thr) | |
| protected void | listAllPages(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, java.lang.String namespace, java.lang.String prefix, long startTime, long endTime) | Perform a series of listPages() operations, so that we fully obtain the documents we're looking for even though we're limited to 500 of them per request. |
| protected boolean | loginToAPI() | Log in via the Wiki API. |
| void | outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName) | Output the configuration body section. |
| void | outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray) | Output the configuration header section. |
| void | outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName) | Output the specification body section. |
| void | outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray) | Output the specification header section. |
| protected static boolean | parseCheckResponse(java.io.InputStream is) | Parse check response, e.g.: |
| protected static boolean | parseGetDocURLsResponse(java.io.InputStream is, java.util.Map<java.lang.String,java.lang.String> urls) | This method parses a response like the following: |
| protected static boolean | parseGetTimestampResponse(java.io.InputStream is, java.util.Map<java.lang.String,java.lang.String> versions) | This method parses a response like the following: |
| protected static boolean | parseListPagesResponse(java.io.InputStream is, org.apache.manifoldcf.connectorcommon.common.XThreadStringBuffer buffer, java.lang.String startPageTitle, WikiConnector.ReturnString lastTitle) | Parse list output, e.g.: |
| protected void | performCheck() | Do the check operation. |
| void | poll() | This method is periodically called for all connectors that are connected but not in active use. |
| java.lang.String | processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) | Process a configuration post. |
| void | processDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority) | Process a set of documents. |
| java.lang.String | processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) | Process a specification post. |
| protected static java.lang.String | readResponseAsString(org.apache.http.HttpResponse httpResponse) | |
| void | viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) | View configuration. |
| void | viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) | View specification. |
-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
getConnectorModel, getFormCheckJavascriptMethodName, getFormPresaveCheckJavascriptMethodName, getRelationshipTypes, requestInfo
-
Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector
clearThreadContext, deinstall, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration
-
-
-
-
Field Detail
-
_rcsid
public static final java.lang.String _rcsid
- See Also:
- Constant Field Values
-
ACTIVITY_FETCH
protected static final java.lang.String ACTIVITY_FETCH
Fetch activity
- See Also:
- Constant Field Values
-
activitiesList
protected static final java.lang.String[] activitiesList
Activities list
-
hasBeenSetup
protected boolean hasBeenSetup
Has setup been called?
-
server
protected java.lang.String server
Server name
-
baseURL
protected java.lang.String baseURL
Base URL
-
userAgent
protected java.lang.String userAgent
The user-agent for this connector instance
-
serverLogin
protected java.lang.String serverLogin
-
serverPass
protected java.lang.String serverPass
-
serverDomain
protected java.lang.String serverDomain
-
accessRealm
protected java.lang.String accessRealm
-
accessUser
protected java.lang.String accessUser
-
accessPassword
protected java.lang.String accessPassword
-
proxyHost
protected java.lang.String proxyHost
-
proxyPort
protected java.lang.String proxyPort
-
proxyDomain
protected java.lang.String proxyDomain
-
proxyUsername
protected java.lang.String proxyUsername
-
proxyPassword
protected java.lang.String proxyPassword
-
connectionManager
protected org.apache.http.conn.HttpClientConnectionManager connectionManager
Connection management
-
httpClient
protected org.apache.http.client.HttpClient httpClient
-
-
Method Detail
-
getActivitiesList
public java.lang.String[] getActivitiesList()
List the activities we might report on.
- Specified by:
getActivitiesList in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
getActivitiesList in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
-
getBinNames
public java.lang.String[] getBinNames(java.lang.String documentIdentifier)
For any given document, list the bins that it is a member of.
- Specified by:
getBinNames in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
getBinNames in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
-
connect
public void connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
Connect.
- Specified by:
connect in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
connect in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
configParameters - is the set of configuration parameters, which in this case describe the target appliance, basic auth configuration, etc.
-
getSession
protected void getSession() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
loginToAPI
protected boolean loginToAPI() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Log in via the Wiki API. Call this method whenever login is apparently needed.
- Returns:
- true if the login was successful, false otherwise.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
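The "call whenever login is apparently needed" contract can be pictured as a retry-after-login pattern. The sketch below is only an illustration under stated assumptions: `LoginRetrySketch`, `FakeWiki`, and the `LOGIN_NEEDED` marker are hypothetical names, not part of WikiConnector or ManifoldCF, and the connector's real control flow may differ.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

public class LoginRetrySketch {
  /** Hypothetical marker a request returns when the server demands a login. */
  public static final String LOGIN_NEEDED = "LOGIN_NEEDED";

  /**
   * Run the request; if it reports that login is needed, log in and retry once.
   * Throws if the login attempt itself reports failure.
   */
  public static String fetchWithLogin(Supplier<String> request, BooleanSupplier login) {
    String result = request.get();
    if (!LOGIN_NEEDED.equals(result)) {
      return result;
    }
    if (!login.getAsBoolean()) {
      throw new IllegalStateException("Login failed");
    }
    return request.get();
  }

  /** Hypothetical in-memory wiki that only serves content after login. */
  public static class FakeWiki {
    private boolean loggedIn = false;

    public String fetch() {
      return loggedIn ? "page content" : LOGIN_NEEDED;
    }

    public boolean login() {
      loggedIn = true;
      return true;
    }
  }

  /** Demonstration: the first fetch needs a login, the retry succeeds. */
  public static String demo() {
    FakeWiki wiki = new FakeWiki();
    return fetchWithLogin(wiki::fetch, wiki::login);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Retrying only once keeps a persistently rejected login from looping forever; a real caller would more likely surface the failure as an exception the framework understands.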
-
check
public java.lang.String check() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Check status of connection.
- Specified by:
check in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
check in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
poll
public void poll() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
This method is periodically called for all connectors that are connected but not in active use.
- Specified by:
poll in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
poll in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
disconnect
public void disconnect() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Close the connection. Call this before discarding the connection.
- Specified by:
disconnect in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
disconnect in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getMaxDocumentRequest
public int getMaxDocumentRequest()
Get the maximum number of documents to amalgamate together into one batch, for this connector.
- Specified by:
getMaxDocumentRequest in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
getMaxDocumentRequest in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Returns:
- the maximum number. 0 indicates "unlimited".
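To make the meaning of this limit concrete, the sketch below splits a list of document identifiers into batches no larger than a given maximum, treating 0 as "unlimited". `BatchSketch` and `partition` are hypothetical names chosen for illustration; the framework's own batching logic is not shown on this page and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSketch {
  /**
   * Split ids into batches of at most maxPerBatch entries.
   * A value of 0 (or less, as an assumption of this sketch) means one unlimited batch.
   */
  public static List<String[]> partition(String[] ids, int maxPerBatch) {
    List<String[]> batches = new ArrayList<>();
    if (ids.length == 0) {
      return batches;
    }
    if (maxPerBatch <= 0) {
      batches.add(ids.clone());
      return batches;
    }
    for (int start = 0; start < ids.length; start += maxPerBatch) {
      int end = Math.min(ids.length, start + maxPerBatch);
      batches.add(Arrays.copyOfRange(ids, start, end));
    }
    return batches;
  }

  public static void main(String[] args) {
    String[] ids = {"1", "2", "3", "4", "5"};
    for (String[] batch : partition(ids, 2)) {
      System.out.println(Arrays.toString(batch));
    }
  }
}
```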
-
addSeedDocuments
public java.lang.String addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Queue "seed" documents. Seed documents are the starting places for crawling activity. Documents are seeded when this method calls appropriate methods in the passed-in ISeedingActivity object.
This method can choose to find repository changes that happen only during the specified time interval. The seeds recorded by this method will be viewed by the framework based on what the getConnectorModel() method returns.
It is not a big problem if the connector chooses to create more seeds than are strictly necessary; it is merely a question of overall work required.
The end time and seeding version string passed to this method may be interpreted for greatest efficiency. For continuous crawling jobs, this method will be called once, when the job starts, and at various periodic intervals as the job executes.
When a job's specification is changed, the framework automatically resets the seeding version string to null. The seeding version string may also be set to null on each job run, depending on the connector model returned by getConnectorModel().
Note that it is always OK to send more documents rather than fewer to this method. The connector will be connected before this method can be called.
- Specified by:
addSeedDocuments in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
addSeedDocuments in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
lastSeedVersion - is the last seeding version string for this job, or null if the job has no previous seeding version string.
seedTime - is the end of the time range of documents to consider, exclusive.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
- Returns:
- an updated seeding version string, to be stored with the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
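One way a connector might interpret the seeding version string is as the end time of the previous seeding pass. The sketch below illustrates that idea only; it is not WikiConnector's implementation. `SeedingSketch`, `SeedSink` (a stand-in for the seeding activity object), and the in-memory repository map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SeedingSketch {
  /** Hypothetical stand-in for the framework's seeding activity object. */
  public interface SeedSink {
    void addSeed(String documentIdentifier);
  }

  /**
   * Seed every document modified in [start, seedTime), where start is parsed from
   * lastSeedVersion (or 0 when it is null), and return the new version string.
   */
  public static String seed(Map<String, Long> modifiedTimes, SeedSink sink,
                            String lastSeedVersion, long seedTime) {
    long start = (lastSeedVersion == null) ? 0L : Long.parseLong(lastSeedVersion);
    for (Map.Entry<String, Long> entry : modifiedTimes.entrySet()) {
      long modified = entry.getValue();
      if (modified >= start && modified < seedTime) {
        sink.addSeed(entry.getKey());
      }
    }
    // The end of this pass becomes the start of the next one.
    return Long.toString(seedTime);
  }

  /** Convenience wrapper that collects the seeds into a list. */
  public static List<String> collect(Map<String, Long> modifiedTimes,
                                     String lastSeedVersion, long seedTime) {
    List<String> seeds = new ArrayList<>();
    seed(modifiedTimes, seeds::add, lastSeedVersion, seedTime);
    return seeds;
  }

  /** Hypothetical repository: page identifier mapped to last-modified time. */
  public static Map<String, Long> sampleRepository() {
    Map<String, Long> repo = new LinkedHashMap<>();
    repo.put("page-a", 10L);
    repo.put("page-b", 60L);
    repo.put("page-c", 20L);
    return repo;
  }

  public static void main(String[] args) {
    System.out.println(collect(sampleRepository(), null, 100L));
    System.out.println(collect(sampleRepository(), "50", 100L));
  }
}
```

A null version string seeds everything, which matches the note that sending more documents than strictly necessary is acceptable; a connector that cannot filter by time could ignore the version string entirely.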
-
processDocuments
public void processDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Process a set of documents. This is the method that should cause each document to be fetched and processed, with the results added to the queue of documents for the current job and/or entered into the incremental ingestion manager. The document specification allows this class to filter what is done based on the job. The connector will be connected before this method can be called.
- Specified by:
processDocuments in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
processDocuments in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
documentIdentifiers - is the set of document identifiers to process.
statuses - are the currently-stored document versions for each document in the set of document identifiers passed in above.
activities - is the interface this method should use to queue up new document references and ingest documents.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
usesDefaultAuthority - will be true only if the authority in use for these documents is the default one.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getAcls
protected static java.lang.String[] getAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab forced acl out of document specification.
- Parameters:
spec - is the document specification.
- Returns:
- the acls.
-
outputConfigurationHeader
public void outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the configuration header section. This method is called in the head section of the connector's configuration page. Its purpose is to add the required tabs to the list, and to output any javascript methods that might be needed by the configuration editing HTML.
- Specified by:
outputConfigurationHeader in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
outputConfigurationHeader in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputConfigurationBody
public void outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the configuration body section. This method is called in the body section of the connector's configuration page. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is "editconnection".
- Specified by:
outputConfigurationBody in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
outputConfigurationBody in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabName - is the current tab name.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
processConfigurationPost
public java.lang.String processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Process a configuration post. This method is called at the start of the connector's configuration page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the configuration parameters accordingly. The name of the posted form is "editconnection".
- Specified by:
processConfigurationPost in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
processConfigurationPost in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
variableContext - is the set of variables available from the post, including binary file post information.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the connection (and cause a redirection to an error page).
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
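The "null if all is well, otherwise an error message" return convention can be sketched with plain maps standing in for the posted variables and the configuration parameters. Everything named here is an assumption made for illustration: `ConfigPostSketch`, the form field names `servername` and `serverport`, and the parameter keys are hypothetical and are not taken from WikiConnector.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigPostSketch {
  /**
   * Copy posted form values into the configuration.
   * Returns null on success, or an error message that should block saving.
   */
  public static String processPost(Map<String, String> posted, Map<String, String> config) {
    String server = posted.get("servername"); // hypothetical form field name
    if (server != null) {
      if (server.trim().isEmpty()) {
        return "Server name must not be empty";
      }
      config.put("Server name", server.trim()); // hypothetical parameter key
    }
    String port = posted.get("serverport"); // hypothetical form field name
    if (port != null) {
      try {
        int value = Integer.parseInt(port.trim());
        if (value < 1 || value > 65535) {
          return "Server port must be between 1 and 65535";
        }
      } catch (NumberFormatException e) {
        return "Server port must be an integer";
      }
      config.put("Server port", port.trim()); // hypothetical parameter key
    }
    // Fields absent from the post are left unchanged in the configuration.
    return null;
  }

  public static void main(String[] args) {
    Map<String, String> posted = new HashMap<>();
    posted.put("servername", "wiki.example.org");
    posted.put("serverport", "abc");
    System.out.println(processPost(posted, new HashMap<>()));
  }
}
```

Leaving absent fields untouched reflects the description above: the method runs whenever form data might have been posted, so it cannot assume every field is present.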
-
viewConfiguration
public void viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
View configuration. This method is called in the body section of the connector's view configuration page. Its purpose is to present the connection information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags.
- Specified by:
viewConfiguration in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
viewConfiguration in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputSpecificationHeader
public void outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the specification header section. This method is called in the head section of a job page which has selected a repository connection of the current type. Its purpose is to add the required tabs to the list, and to output any javascript methods that might be needed by the job editing HTML. The connector will be connected before this method can be called.
- Specified by:
outputSpecificationHeader in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
outputSpecificationHeader in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
out - is the output to which any HTML should be sent.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputSpecificationBody
public void outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the specification body section. This method is called in the body section of a job page which has selected a repository connection of the current type. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is always "editjob". The connector will be connected before this method can be called.
- Specified by:
outputSpecificationBody in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
outputSpecificationBody in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
out - is the output to which any HTML should be sent.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
actualSequenceNumber - is the connection within the job that has currently been selected.
tabName - is the current tab name. (actualSequenceNumber, tabName) form a unique tuple within the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
processSpecificationPost
public java.lang.String processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionProcess a specification post. This method is called at the start of a job's edit or view page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the document specification accordingly. The name of the posted form is always "editjob". The connector will be connected before this method can be called.- Specified by:
processSpecificationPostin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
processSpecificationPostin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Parameters:
variableContext- contains the post data, including binary file-upload information.locale- is the locale the output is preferred to be in.ds- is the current document specification for this job.connectionSequenceNumber- is the unique number of this connection within the job.- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the job (and cause a redirection to an error page).
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
viewSpecification
public void viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOExceptionView specification. This method is called in the body section of a job's view page. Its purpose is to present the document specification information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags. The connector will be connected before this method can be called.- Specified by:
viewSpecificationin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
viewSpecificationin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Parameters:
out- is the output to which any HTML should be sent.locale- is the locale the output is preferred to be in.ds- is the current document specification for this job.connectionSequenceNumber- is the unique number of this connection within the job.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionjava.io.IOException
-
getInitializedGetMethod
protected org.apache.http.client.methods.HttpRequestBase getInitializedGetMethod(java.lang.String URL) throws java.io.IOExceptionCreate and initialize an HttpRequestBase- Throws:
java.io.IOException
-
getInitializedPostMethod
protected org.apache.http.client.methods.HttpRequestBase getInitializedPostMethod(java.lang.String URL, java.util.Map<java.lang.String,java.lang.String> params) throws java.io.IOExceptionCreate and initialize a post method- Throws:
java.io.IOException
-
performCheck
protected void performCheck() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionDo the check operation. This throws an exception if anything is wrong.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getCheckURL
protected java.lang.String getCheckURL() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionGet a URL for a check operation.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
parseCheckResponse
protected static boolean parseCheckResponse(java.io.InputStream is) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionParse check response, e.g.: <api xmlns="http://www.mediawiki.org/xml/api/"> <query> <allpages> <p pageid="19839654" ns="0" title="Kre'fey" /> </allpages> </query> <query-continue> <allpages apfrom="Krea" /> </query-continue> </api>- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
listAllPages
protected void listAllPages(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, java.lang.String namespace, java.lang.String prefix, long startTime, long endTime) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionPerform a series of listPages() operations, so that we fully obtain the documents we're looking for even though we're limited to 500 of them per request.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
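The paging described for listAllPages can be illustrated with a minimal sketch. It is not the connector's actual code: the PageFetcher interface and the listAll method are hypothetical stand-ins, and the sketch assumes that each request returns titles in order starting at (and including) the requested start title, so the first title of every follow-up batch is skipped as a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

class ListAllPagesSketch {
  /** Hypothetical stand-in for one list-pages request (not a ManifoldCF type). */
  interface PageFetcher {
    /** Returns up to 'limit' titles in order, starting at 'startTitle' inclusive (null = from the beginning). */
    List<String> fetch(String startTitle, int limit);
  }

  /** Repeats limited requests until a request yields no titles that were not already seen. */
  static List<String> listAll(PageFetcher fetcher, int limit) {
    List<String> all = new ArrayList<>();
    String start = null;
    while (true) {
      List<String> batch = fetcher.fetch(start, limit);
      String last = start;
      for (String title : batch) {
        if (title.equals(start)) {
          continue; // already recorded at the end of the previous batch
        }
        all.add(title);
        last = title;
      }
      if (last == null || last.equals(start)) {
        break; // nothing new came back, so the listing is complete
      }
      start = last; // resume from the last title seen
    }
    return all;
  }
}
```

Restarting from the last title seen mirrors the documented behavior of executeListPagesViaThread, which returns the last page title; the duplicate-skipping rule is an assumption of this sketch.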
-
executeListPagesViaThread
protected java.lang.String executeListPagesViaThread(java.lang.String startPageTitle, java.lang.String namespace, java.lang.String prefix, org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionExecute a listPages() operation via a thread. Returns the last page title.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getListPagesURL
protected java.lang.String getListPagesURL(java.lang.String startingTitle, java.lang.String namespace, java.lang.String prefix) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionCreate a URL to obtain the next 500 pages.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
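As a rough illustration of what such a URL might look like, the sketch below builds a MediaWiki allpages query. The parameter names (action, list, apfrom, apnamespace, apprefix, aplimit, format) come from the public MediaWiki API; the base URL argument, the class name, and the exact parameter set are assumptions, and the connector's real URL may differ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ListPagesUrlSketch {
  /** Builds an allpages query URL; null arguments are simply left out. */
  static String buildListPagesUrl(String baseApiUrl, String startingTitle,
                                  String namespace, String prefix) {
    StringBuilder sb = new StringBuilder(baseApiUrl);
    sb.append("?action=query&list=allpages&format=xml&aplimit=500");
    appendParam(sb, "apfrom", startingTitle);
    appendParam(sb, "apnamespace", namespace);
    appendParam(sb, "apprefix", prefix);
    return sb.toString();
  }

  /** Appends "&name=value" with the value form-encoded as UTF-8. */
  private static void appendParam(StringBuilder sb, String name, String value) {
    if (value == null) {
      return;
    }
    try {
      sb.append('&').append(name).append('=').append(URLEncoder.encode(value, "UTF-8"));
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is a required charset on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

Encoding each value matters because page titles often contain spaces and apostrophes, as in the sample titles shown for parseListPagesResponse.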
-
parseListPagesResponse
protected static boolean parseListPagesResponse(java.io.InputStream is, org.apache.manifoldcf.connectorcommon.common.XThreadStringBuffer buffer, java.lang.String startPageTitle, WikiConnector.ReturnString lastTitle) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionParse list output, e.g.: <api xmlns="http://www.mediawiki.org/xml/api/"> <query> <allpages> <p pageid="19839654" ns="0" title="Kre'fey" /> <p pageid="30955295" ns="0" title="Kre-O" /> <p pageid="14773725" ns="0" title="Kre8tiveworkz" /> <p pageid="19219017" ns="0" title="Kre M'Baye" /> <p pageid="19319577" ns="0" title="Kre Mbaye" /> </allpages> </query> <query-continue> <allpages apfrom="Krea" /> </query-continue> </api>- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
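To make the response shape above concrete, here is a small sketch that pulls the page IDs, the last title, and the continuation title out of such a document. It uses the JDK's DOM parser for brevity; the connector itself appears to use context classes for stream parsing, so this is an illustration of the data layout only. The class, the Result holder, and its field names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ListPagesParseSketch {
  /** Hypothetical holder for what a list response yields. */
  static final class Result {
    final List<String> pageIds = new ArrayList<>();
    String lastTitle;      // title of the last <p> element, or null if none
    String continueFrom;   // apfrom value under <query-continue>, or null if absent
  }

  static Result parse(String xml) {
    Result result = new Result();
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      // Each <p> element describes one page.
      NodeList pages = doc.getElementsByTagName("p");
      for (int i = 0; i < pages.getLength(); i++) {
        Element p = (Element) pages.item(i);
        result.pageIds.add(p.getAttribute("pageid"));
        result.lastTitle = p.getAttribute("title");
      }
      // <query-continue><allpages apfrom="..."/> signals that more pages remain.
      NodeList qc = doc.getElementsByTagName("query-continue");
      if (qc.getLength() > 0) {
        NodeList ap = ((Element) qc.item(0)).getElementsByTagName("allpages");
        if (ap.getLength() > 0) {
          String from = ((Element) ap.item(0)).getAttribute("apfrom");
          if (!from.isEmpty()) {
            result.continueFrom = from;
          }
        }
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse list response", e);
    }
    return result;
  }
}
```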
-
getDocURLs
protected void getDocURLs(java.lang.String[] documentIdentifiers, java.util.Map<java.lang.String,java.lang.String> urls) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getGetDocURLsURL
protected java.lang.String getGetDocURLsURL(java.lang.String[] documentIdentifiers) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionCreate a URL to obtain the URLs of multiple pages, given the page IDs.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
parseGetDocURLsResponse
protected static boolean parseGetDocURLsResponse(java.io.InputStream is, java.util.Map<java.lang.String,java.lang.String> urls) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionThis method parses a response like the following: <api> <query> <pages> <page pageid="27697087" ns="0" title="API" fullurl="..."/> </pages> </query> </api>- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getTimestamps
protected void getTimestamps(java.lang.String[] documentIdentifiers, java.util.Map<java.lang.String,java.lang.String> versions, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionObtain document versions for a set of documents.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getGetTimestampURL
protected java.lang.String getGetTimestampURL(java.lang.String[] documentIdentifiers) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionCreate a URL to obtain the timestamps of multiple pages, given the page IDs.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
parseGetTimestampResponse
protected static boolean parseGetTimestampResponse(java.io.InputStream is, java.util.Map<java.lang.String,java.lang.String> versions) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionThis method parses a response like the following: <api> <query> <pages> <page pageid="27697087" ns="0" title="API"> <revisions> <rev user="Graham87" timestamp="2010-06-13T08:41:17Z" /> </revisions> </page> </pages> </query> </api>- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
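A response of this shape maps naturally onto a page-ID-to-timestamp map, which is presumably what the versions parameter collects. The following sketch shows one way to fill such a map with the JDK's DOM parser. The class name is hypothetical, and taking the first rev element per page is an assumption of the sketch rather than documented connector behavior.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TimestampParseSketch {
  /** Returns pageid -> revision timestamp for every page that carries a <rev> element. */
  static Map<String, String> parse(String xml) {
    Map<String, String> versions = new LinkedHashMap<>();
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      NodeList pages = doc.getElementsByTagName("page");
      for (int i = 0; i < pages.getLength(); i++) {
        Element page = (Element) pages.item(i);
        NodeList revs = page.getElementsByTagName("rev");
        if (revs.getLength() == 0) {
          continue; // no revision information for this page
        }
        Element rev = (Element) revs.item(0);
        versions.put(page.getAttribute("pageid"), rev.getAttribute("timestamp"));
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse timestamp response", e);
    }
    return versions;
  }
}
```

Pages without a rev element are left out of the map, so a caller can tell a missing version apart from an empty one.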
-
getNamespaces
protected void getNamespaces(java.util.Map<java.lang.String,java.lang.String> namespaces) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionObtain the set of namespaces, as a map keyed by the canonical namespace name where the value is the descriptive name.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getGetNamespacesURL
protected java.lang.String getGetNamespacesURL() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionCreate a URL to obtain the namespaces.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getDocInfo
protected void getDocInfo(java.lang.String documentIdentifier, java.lang.String documentVersion, java.lang.String fullURL, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, java.lang.String[] allowACL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionGet document info and index the document.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getGetDocInfoURL
protected java.lang.String getGetDocInfoURL(java.lang.String documentIdentifier) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionCreate a URL to obtain a page's metadata and content, given the page ID. Open question: can multiple document identifiers be requested at a time?- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
readResponseAsString
protected static java.lang.String readResponseAsString(org.apache.http.HttpResponse httpResponse) throws java.io.IOException- Throws:
java.io.IOException
-
handleException
protected static void handleException(java.lang.Throwable thr) throws java.lang.InterruptedException, org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption, java.io.IOException, org.apache.http.HttpException- Throws:
java.lang.InterruptedExceptionorg.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruptionjava.io.IOExceptionorg.apache.http.HttpException
-
-