Class GridFSRepositoryConnector
- java.lang.Object
-
- org.apache.manifoldcf.core.connector.BaseConnector
-
- org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
-
- org.apache.manifoldcf.crawler.connectors.gridfs.GridFSRepositoryConnector
-
- All Implemented Interfaces:
org.apache.manifoldcf.core.interfaces.IConnector,org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
public class GridFSRepositoryConnector extends org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Author:
- molgun
-
-
Field Summary
Fields Modifier and Type Field Description protected java.lang.StringaclEndpoint acl.protected static java.lang.StringACTIVITY_FETCHActivity name for the activity record.protected java.lang.StringbucketEndpoint bucket.protected java.lang.StringdbEndpoint db.protected java.lang.StringdenyAclEndpoint denyAcl.protected static java.util.HashMapdocumentKnownColumnsSpecial column names, as far as document queries are concernedprotected java.lang.StringhostEndpoint host.protected longlastSessionFetchLast session fetch time.protected java.lang.StringpasswordEndpoint password.protected java.lang.StringportEndpoint port.protected static java.lang.StringSERVERServer name for declaring bin name.protected com.mongodb.DBsessionMongoDB session.protected static longSESSION_EXPIRATION_MILLISECONDSSession expiration milliseconds.protected java.lang.StringurlEndpoint url.protected java.lang.StringusernameEndpoint username.-
Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector
currentContext, params
-
Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
GLOBAL_DENY_TOKEN, JOBMODE_CONTINUOUS, JOBMODE_ONCEONLY, MODEL_ADD, MODEL_ADD_CHANGE, MODEL_ADD_CHANGE_DELETE, MODEL_ALL, MODEL_CHAINED_ADD, MODEL_CHAINED_ADD_CHANGE, MODEL_CHAINED_ADD_CHANGE_DELETE, MODEL_PARTIAL
-
-
Constructor Summary
Constructors Constructor Description GridFSRepositoryConnector()Constructer.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description java.lang.StringaddSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode)Queue "seed" documents.protected voidapplyMetadata(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, com.mongodb.DBObject metadataMap)Apply metadata to a repository document.java.lang.Stringcheck()Test the connection.voidconnect(org.apache.manifoldcf.core.interfaces.ConfigParams configParams)Connect.voiddisconnect()Close the connection.voidfillInServerParameters(java.util.Map<java.lang.String,java.lang.String> paramMap, org.apache.manifoldcf.core.interfaces.IPasswordMapperActivity mapper, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)Fill in a Server tab configuration parameter map for calling a Velocity template.java.lang.String[]getActivitiesList()Return the list of activities that this connector supports (i.e.java.lang.String[]getBinNames(java.lang.String documentIdentifier)Tell the world what model this connector uses for addSeedDocuments().intgetConnectorModel()Tell the world what model this connector uses for addSeedDocuments().intgetMaxDocumentRequest()Get the maximum number of documents to amalgamate together into one batch, for this connector.java.lang.String[]getRelationshipTypes()Return the list of relationship types that this connector recognizes.protected voidgetSession()Setup a session.protected static voidhandleIOException(java.io.IOException e)booleanisConnected()This method is called to assess whether to count this connector instance should actually be counted as being connected.voidoutputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName)Output the configuration body section.voidoutputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray)Output the configuration header section.voidpoll()This method is periodically called for all connectors that are connected but not in active use.java.lang.StringprocessConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)Process a configuration post.voidprocessDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority)Process a set of documents.voidviewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)View configuration.-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
getFormCheckJavascriptMethodName, getFormPresaveCheckJavascriptMethodName, outputSpecificationBody, outputSpecificationHeader, processSpecificationPost, requestInfo, viewSpecification
-
Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector
clearThreadContext, deinstall, getConfiguration, install, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration
-
-
-
-
Field Detail
-
ACTIVITY_FETCH
protected static final java.lang.String ACTIVITY_FETCH
Activity name for the activity record.- See Also:
- Constant Field Values
-
SERVER
protected static final java.lang.String SERVER
Server name for declaring bin name.- See Also:
- Constant Field Values
-
SESSION_EXPIRATION_MILLISECONDS
protected static final long SESSION_EXPIRATION_MILLISECONDS
Session expiration milliseconds.- See Also:
- Constant Field Values
-
username
protected java.lang.String username
Endpoint username.
-
password
protected java.lang.String password
Endpoint password.
-
host
protected java.lang.String host
Endpoint host.
-
port
protected java.lang.String port
Endpoint port.
-
db
protected java.lang.String db
Endpoint db.
-
bucket
protected java.lang.String bucket
Endpoint bucket.
-
url
protected java.lang.String url
Endpoint url.
-
acl
protected java.lang.String acl
Endpoint acl.
-
denyAcl
protected java.lang.String denyAcl
Endpoint denyAcl.
-
session
protected com.mongodb.DB session
MongoDB session.
-
lastSessionFetch
protected long lastSessionFetch
Last session fetch time.
-
documentKnownColumns
protected static java.util.HashMap documentKnownColumns
Special column names, as far as document queries are concerned
-
-
Method Detail
-
getBinNames
public java.lang.String[] getBinNames(java.lang.String documentIdentifier)
Tell the world what model this connector uses for addSeedDocuments(). This must return a model value as specified above. The connector does not have to be connected for this method to be called.- Specified by:
getBinNamesin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
getBinNamesin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Returns:
- the model type value.
-
getConnectorModel
public int getConnectorModel()
Tell the world what model this connector uses for addSeedDocuments(). This must return a model value as specified above. The connector does not have to be connected for this method to be called.- Specified by:
getConnectorModelin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
getConnectorModelin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Returns:
- the model type value.
-
getActivitiesList
public java.lang.String[] getActivitiesList()
Return the list of activities that this connector supports (i.e. writes into the log). The connector does not have to be connected for this method to be called.- Specified by:
getActivitiesListin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
getActivitiesListin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Returns:
- the list.
-
connect
public void connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParams)
Connect.- Specified by:
connectin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
connectin classorg.apache.manifoldcf.core.connector.BaseConnector- Parameters:
configParams- is the set of configuration parameters, which in this case describe the root directory.
-
check
public java.lang.String check() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionTest the connection. Returns a string describing the connection integrity.- Specified by:
checkin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
checkin classorg.apache.manifoldcf.core.connector.BaseConnector- Returns:
- the connection's status as a displayable string.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
disconnect
public void disconnect() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionClose the connection. Call this before discarding this instance of the repository connector.- Specified by:
disconnectin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
disconnectin classorg.apache.manifoldcf.core.connector.BaseConnector- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
poll
public void poll() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionThis method is periodically called for all connectors that are connected but not in active use.- Specified by:
pollin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
pollin classorg.apache.manifoldcf.core.connector.BaseConnector- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
isConnected
public boolean isConnected()
This method is called to assess whether to count this connector instance should actually be counted as being connected.- Specified by:
isConnectedin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
isConnectedin classorg.apache.manifoldcf.core.connector.BaseConnector- Returns:
- true if the connector instance is actually connected.
-
getMaxDocumentRequest
public int getMaxDocumentRequest()
Get the maximum number of documents to amalgamate together into one batch, for this connector.- Specified by:
getMaxDocumentRequestin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
getMaxDocumentRequestin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Returns:
- the maximum number. 0 indicates "unlimited".
-
getRelationshipTypes
public java.lang.String[] getRelationshipTypes()
Return the list of relationship types that this connector recognizes.- Specified by:
getRelationshipTypesin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
getRelationshipTypesin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Returns:
- the list.
-
addSeedDocuments
public java.lang.String addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionQueue "seed" documents. Seed documents are the starting places for crawling activity. Documents are seeded when this method calls appropriate methods in the passed in ISeedingActivity object. This method can choose to find repository changes that happen only during the specified time interval. The seeds recorded by this method will be viewed by the framework based on what the getConnectorModel() method returns. It is not a big problem if the connector chooses to create more seeds than are strictly necessary; it is merely a question of overall work required. The end time and seeding version string passed to this method may be interpreted for greatest efficiency. For continuous crawling jobs, this method will be called once, when the job starts, and at various periodic intervals as the job executes. When a job's specification is changed, the framework automatically resets the seeding version string to null. The seeding version string may also be set to null on each job run, depending on the connector model returned by getConnectorModel(). Note that it is always ok to send MORE documents rather than less to this method. The connector will be connected before this method can be called.- Specified by:
addSeedDocumentsin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
addSeedDocumentsin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Parameters:
activities- is the interface this method should use to perform whatever framework actions are desired.spec- is a document specification (that comes from the job).seedTime- is the end of the time range of documents to consider, exclusive.lastSeedVersion- is the last seeding version string for this job, or null if the job has no previous seeding version string.jobMode- is an integer describing how the job is being run, whether continuous or once-only.- Returns:
- an updated seeding version string, to be stored with the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
processDocuments
public void processDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruptionProcess a set of documents. This is the method that should cause each document to be fetched, processed, and the results either added to the queue of documents for the current job, and/or entered into the incremental ingestion manager. The document specification allows this class to filter what is done based on the job. The connector will be connected before this method can be called.- Specified by:
processDocumentsin interfaceorg.apache.manifoldcf.crawler.interfaces.IRepositoryConnector- Overrides:
processDocumentsin classorg.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector- Parameters:
documentIdentifiers- is the set of document identifiers to process.statuses- are the currently-stored document versions for each document in the set of document identifiers passed in above.activities- is the interface this method should use to queue up new document references and ingest documents.jobMode- is an integer describing how the job is being run, whether continuous or once-only.usesDefaultAuthority- will be true only if the authority in use for these documents is the default one.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
handleIOException
protected static void handleIOException(java.io.IOException e) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionorg.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
outputConfigurationHeader
public void outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOExceptionOutput the configuration header section. This method is called in the head section of the connector's configuration page. Its purpose is to add the required tabs to the list, and to output any javascript methods that might be needed by the configuration editing HTML. The connector does not need to be connected for this method to be called.- Specified by:
outputConfigurationHeaderin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
outputConfigurationHeaderin classorg.apache.manifoldcf.core.connector.BaseConnector- Parameters:
threadContext- is the local thread context.out- is the output to which any HTML should be sent.parameters- are the configuration parameters, as they currently exist, for this connection being configured.tabsArray- is an array of tab names. Add to this array any tab names that are specific to the connector.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionjava.io.IOException
-
outputConfigurationBody
public void outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOExceptionOutput the configuration body section. This method is called in the body section of the connector's configuration page. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is always "editconnection". The connector does not need to be connected for this method to be called.- Specified by:
outputConfigurationBodyin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
outputConfigurationBodyin classorg.apache.manifoldcf.core.connector.BaseConnector- Parameters:
threadContext- is the local thread context.out- is the output to which any HTML should be sent.parameters- are the configuration parameters, as they currently exist, for this connection being configured.tabName- is the current tab name.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionjava.io.IOException
-
processConfigurationPost
public java.lang.String processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionProcess a configuration post. This method is called at the start of the connector's configuration page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the configuration parameters accordingly. The name of the posted form is always "editconnection". The connector does not need to be connected for this method to be called.- Specified by:
processConfigurationPostin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
processConfigurationPostin classorg.apache.manifoldcf.core.connector.BaseConnector- Parameters:
threadContext- is the local thread context.variableContext- is the set of variables available from the post, including binary file post information.parameters- are the configuration parameters, as they currently exist, for this connection being configured.- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the connection (and cause a redirection to an error page).
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
viewConfiguration
public void viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOExceptionView configuration. This method is called in the body section of the connector's view configuration page. Its purpose is to present the connection information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags. The connector does not need to be connected for this method to be called.- Specified by:
viewConfigurationin interfaceorg.apache.manifoldcf.core.interfaces.IConnector- Overrides:
viewConfigurationin classorg.apache.manifoldcf.core.connector.BaseConnector- Parameters:
threadContext- is the local thread context.out- is the output to which any HTML should be sent.parameters- are the configuration parameters, as they currently exist, for this connection being configured.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionjava.io.IOException
-
getSession
protected void getSession() throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionSetup a session.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
fillInServerParameters
public void fillInServerParameters(java.util.Map<java.lang.String,java.lang.String> paramMap, org.apache.manifoldcf.core.interfaces.IPasswordMapperActivity mapper, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)Fill in a Server tab configuration parameter map for calling a Velocity template.- Parameters:
paramMap- is the map to fill inparameters- is the current set of configuration parameters
-
applyMetadata
protected void applyMetadata(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, com.mongodb.DBObject metadataMap) throws org.apache.manifoldcf.core.interfaces.ManifoldCFExceptionApply metadata to a repository document.- Parameters:
rd- is the repository document to apply the metadata to.metadataMap- is the resultset row to use to get the metadata. All non-special columns from this row will be considered to be metadata.- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
-