Command Line Interface
fgreg edited this page May 3, 2017 · 7 revisions

    As MUDROD needs at least two types of data (i.e., web logs and metadata) to work, it offers four commands to perform different tasks.
- Log ingest command `-dataDir $pathToData -l`. This command ingests, preprocesses, and processes all your HTTP and FTP logs. Run it whenever you have new web logs. Note that HTTP and FTP logs must be stored in separate subfolders of the logDir. See https://github.com/mudrod/mudrod/blob/master/core/src/main/resources/config.xml#L27 for details.
- Metadata ingest command `-dataDir $pathToData -m`. This command ingests, preprocesses, and processes all your metadata (in JSON format for now). Run it whenever you need to update the metadata. Internally, it creates a subfolder under `dataDir` and downloads metadata using the PO.DAAC web services. The subfolder name is "RawMetadata" by default. You don't need to empty Elasticsearch first, as the command does so automatically.
- Full ingest command `-dataDir $pathToData -f`. It ingests both web logs and metadata, combining the two commands above. Think twice before using it if you want to avoid log ingest, which takes a relatively long time.
- Processing command `-dataDir $pathToData -p`. It assumes the logDir already contains all of the processed/intermediate results. This is intended only for developers who want to set up a local environment.
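
The four invocations above differ only in their final flag. As a minimal sketch, the shell function below maps a task name to the matching argument string. The function name `mudrod_args`, the task names, and the example path are hypothetical and chosen for illustration. Only `-dataDir` and the flags `-l`, `-m`, `-f`, `-p` come from this page, and the launcher that receives these arguments is not named here, so it is left out.

```shell
#!/bin/sh
# Hypothetical helper: prints the MUDROD arguments for a given task.
# Only -dataDir and the flags (-l, -m, -f, -p) come from this page;
# the function name and task names are illustrative assumptions.
mudrod_args() {
  task="$1"
  path_to_data="$2"
  case "$task" in
    logs)     flag="-l" ;;  # log ingest
    metadata) flag="-m" ;;  # metadata ingest
    full)     flag="-f" ;;  # web logs + metadata
    process)  flag="-p" ;;  # processing only (developers)
    *) echo "unknown task: $task" >&2; return 1 ;;
  esac
  printf '%s\n' "-dataDir $path_to_data $flag"
}

# Example: print the arguments for a full ingest of a placeholder path.
mudrod_args full /data/mudrod
```

Keeping the mapping in one place makes it harder to pass the slow full-ingest flag by accident when only a metadata update is wanted.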
- For brand new ingestion tasks: Just run `-dataDir $pathToData -f`. Make sure you have logs in place, as MUDROD doesn't support automatic log downloading right now.
- For incremental/update ingestion tasks: Run `-dataDir $pathToData -l` if you want to ingest new web logs, or run `-dataDir $pathToData -m` if you want to update the metadata.
- For developers: Run `-dataDir $pathToData -p` to quickly set up a system environment. In this case, `$pathToData` must contain processed/intermediate results, not raw log files.
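
Because MUDROD does not download logs itself, it can help to confirm that the data directory is populated before starting an ingest. The following is a rough sketch of such a pre-flight check. The function name `check_data_dir` is hypothetical, and the check only verifies that the directory exists and is non-empty. It does not validate file formats or the subfolder layout MUDROD expects.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only if the directory exists
# and contains at least one entry. It does not inspect file formats or
# the HTTP/FTP subfolder layout.
check_data_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # ls -A lists every entry except . and ..; empty output means empty dir.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "directory is empty: $dir" >&2
    return 1
  fi
  return 0
}

# Example: fall back to the current directory if $pathToData is unset.
if check_data_dir "${pathToData:-.}"; then
  echo "data directory looks populated"
fi
```

The same check applies to the developer case, where `$pathToData` should already hold processed/intermediate results rather than raw logs.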