1. How to Use JPlag

Timur Sağlam edited this page Dec 6, 2022 · 28 revisions

JPlag can be used via the Command Line Interface (CLI) or programmatically via the Java API.

Using JPlag via the CLI

To run JPlag from the command line, execute the JAR file and pass the root directory that contains the submissions.

Example: java -jar jplag.jar path/to/the/submissions

The following arguments can be used to control JPlag:

positional arguments:
  rootDir                Root-directory with submissions to check for plagiarism

named arguments:
  -h, --help             show this help message and exit
  -new NEW [NEW ...]     Root-directory with submissions to check for plagiarism (same as the root directory)
  -old OLD [OLD ...]     Root-directory with prior submissions to compare against
  -l {cpp,csharp,emf,go,java,kotlin,python3,rlang,scala,scheme,swift,text}
                         Select the language to parse the submissions (default: java)
  -bc BC                 Path of  the  directory  containing  the  base  code  (common  framework  used  in  all
                         submissions)
  -t T                   Tunes the comparison sensitivity by adjusting the  minimum token required to be counted
                         as a matching section. A smaller <n>  increases  the sensitivity but might lead to more
                         false-positives
  -n N                   The maximum number of comparisons that will  be  shown  in the generated report, if set
                         to -1 all comparisons will be shown (default: 100)
  -r R                   Name of the directory in which the comparison results will be stored (default: result)

Advanced:
  -d                     Debug parser. Non-parsable files will be stored (default: false)
  -s S                   Look in directories <root-dir>/*/<dir> for programs
  -p P                   comma-separated list of all filename suffixes that are included
  -x X                   All files named in this file will be ignored in the comparison (line-separated list)
  -m M                   Comparison similarity threshold [0.0-1.0]:  All  comparisons  above this threshold will
                         be saved (default: 0.0)

Clustering:
  --cluster-skip         Skips the clustering (default: false)
  --cluster-alg {AGGLOMERATIVE,SPECTRAL}
                         Which clustering algorithm to use. Agglomerative  merges similar submissions bottom up.
                         Spectral clustering is  combined  with  Bayesian  Optimization  to  execute the k-Means
                         clustering  algorithm  multiple   times,   hopefully   finding   a   "good"  clustering
                         automatically. (default: spectral)
  --cluster-metric {AVG,MIN,MAX,INTERSECTION}
                         The metric used for clustering. AVG  is  intersection  over  union, MAX can expose some
                         attempts of obfuscation. (default: MAX)

Note that the legacy CLI differs slightly.
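
The interplay of the -m and -n arguments can be pictured with a small, self-contained sketch. It is not JPlag code: the class ReportSelection, the record Comparison, and the method select are hypothetical names, and treating the threshold as inclusive is an assumption. The sketch keeps comparisons that reach the threshold, orders them by descending similarity, and truncates the list unless the limit is -1.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the -m and -n arguments; not part of the JPlag API. */
final class ReportSelection {

    /** A pairwise comparison reduced to a label and a similarity in [0.0, 1.0]. */
    record Comparison(String pair, double similarity) {}

    /**
     * Keeps comparisons at or above the threshold (like -m; inclusiveness is assumed),
     * sorts them by descending similarity, and keeps at most 'limit' entries (like -n).
     * A negative limit such as -1 keeps all of them.
     */
    static List<Comparison> select(List<Comparison> all, double threshold, int limit) {
        return all.stream()
                .filter(c -> c.similarity() >= threshold)
                .sorted(Comparator.comparingDouble(Comparison::similarity).reversed())
                .limit(limit < 0 ? Long.MAX_VALUE : limit)
                .collect(Collectors.toList());
    }
}
```

With this sketch, select(all, 0.0, 100) mirrors the documented defaults, and select(all, 0.0, -1) keeps every comparison.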

Using JPlag programmatically

The new API makes it easy to integrate JPlag's plagiarism detection into external Java projects.

Example:

Language language = new de.jplag.java.Language();
Set<File> submissionDirectories = Set.of(new File("/path/to/rootDir"));
File baseCode = new File("/path/to/baseCode");
JPlagOptions options = new JPlagOptions(language, submissionDirectories, Set.of()).withBaseCodeSubmissionDirectory(baseCode);

JPlag jplag = new JPlag(options);
try {
    JPlagResult result = jplag.run();
     
    // Optional: create the report and save it to the given path
    ReportObjectFactory reportObjectFactory = new ReportObjectFactory();
    reportObjectFactory.createAndSaveReport(result, "/path/to/output");
} catch (ExitException e) {
    // error handling here
}

Report File Generation

After a JPlag run, a zipped result report is created automatically. The target location of the report can be specified with the -r flag.

If the -r flag is not specified, the location defaults to result.zip. Specifying the -r flag with a path /path/to/desiredFolder results in the report being created as /path/to/desiredFolder.zip.

The report is always zipped unless an error occurs during the zipping process. If zipping fails, the report is available unzipped at the specified location.
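
The naming rule above can be written down as a short sketch. The class ReportLocation and the method zipPathFor are hypothetical names chosen for illustration and are not part of JPlag. The sketch only mirrors the documented behavior: the value of -r, or result when -r is absent, gets the suffix .zip.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of the report naming rule; not part of the JPlag API. */
final class ReportLocation {

    /** Default value of the -r argument according to the CLI help. */
    static final String DEFAULT_RESULT_NAME = "result";

    /**
     * Returns the path of the zipped report for a given -r value.
     * A null or blank value falls back to the default name.
     * Assumes the value is not a file system root.
     */
    static Path zipPathFor(String resultArgument) {
        String base = (resultArgument == null || resultArgument.isBlank())
                ? DEFAULT_RESULT_NAME
                : resultArgument;
        Path target = Paths.get(base);
        // Append ".zip" to the last path element, next to the requested folder.
        return target.resolveSibling(target.getFileName() + ".zip");
    }
}
```

For example, zipPathFor(null) yields result.zip and zipPathFor("/path/to/desiredFolder") yields /path/to/desiredFolder.zip.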

Viewing Reports

The newest version of the report viewer is always accessible at https://jplag.github.io/JPlag/. Simply drop your result.zip file on the page to start inspecting the results of your JPlag run. Your submissions are neither uploaded to a server nor stored permanently; they are kept in the application only while you view them. Once you refresh the page, all information is erased.

Basic Concepts

This section explains some fundamental concepts about JPlag that make it easier to understand and use.

  • Root directory: This is the directory in which JPlag will scan for submissions.
  • Submissions: Submissions contain the source code that JPlag will parse and compare. They have to be direct children of the root directory and can either be single files or directories.

Example: Single-file submissions

/path/to/root-directory
├── Submission-1.java
├── ...
└── Submission-n.java

Example: Directory submissions

JPlag will read submission directories recursively, so they can contain multiple (nested) source code files.

/path/to/root-directory
├── Submission-1
│   ├── Main.java
│   └── util
│       └── Utils.java
├── ...
└── Submission-n
    ├── Main.java
    └── util
        └── Utils.java
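
The rule that submissions are the direct children of the root directory can be illustrated with a minimal sketch based on java.nio. The class SubmissionListing and its method submissionNames are hypothetical names, and the code does not reproduce how JPlag itself discovers submissions. It only shows that single files and directories count alike, while nested files do not become submissions of their own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration of submission discovery; not part of the JPlag API. */
final class SubmissionListing {

    /**
     * Returns the sorted names of all direct children of the root directory.
     * Files.list does not descend into subdirectories, so nested files such as
     * Submission-1/util/Utils.java are not listed separately.
     */
    static List<String> submissionNames(Path rootDirectory) throws IOException {
        try (Stream<Path> children = Files.list(rootDirectory)) {
            return children
                    .map(child -> child.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```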

If you want JPlag to scan only one specific subdirectory of each submission for source code files (e.g. src), you can configure that with the argument -s:

/path/to/root-directory
├── Submission-1
│   ├── src                 
│   │   ├── Main.java       # Included
│   │   └── util            
│   │       └── Utils.java  # Included
│   ├── lib                 
│   │   └── Library.java    # Ignored
│   └── Other.java          # Ignored
└── ...
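
The effect of restricting the scan to one subdirectory can be sketched as follows. The class SubdirectoryScan and the method includedFiles are hypothetical names, and the code is an approximation of the behavior shown in the tree above rather than JPlag's implementation. It collects only the regular files below the chosen subdirectory of a submission.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration of the subdirectory argument; not part of the JPlag API. */
final class SubdirectoryScan {

    /**
     * Returns the sorted paths (relative to the submission, with '/' separators)
     * of all regular files below submission/subdirectory. Files elsewhere in the
     * submission are ignored. A missing subdirectory yields an empty list.
     */
    static List<String> includedFiles(Path submission, String subdirectory) throws IOException {
        Path start = submission.resolve(subdirectory);
        if (!Files.isDirectory(start)) {
            return List.of();
        }
        try (Stream<Path> walk = Files.walk(start)) {
            return walk.filter(Files::isRegularFile)
                    .map(file -> submission.relativize(file).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Applied to Submission-1 from the tree above with the subdirectory src, the sketch returns src/Main.java and src/util/Utils.java, while lib/Library.java and Other.java are left out.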

Example: Base Code

The base code is a special kind of submission: the template on which all other submissions are based. JPlag ignores any match between two submissions that is also part of the base code. Like any other submission, the base code has to be a single file or directory in the root directory.

/path/to/root-directory
├── BaseCode
│   └── Solution.java
├── Submission-1
│   └── Solution.java
├── ...
└── Submission-n
    └── Solution.java

In this example, students have to solve a given problem by implementing the run method in the template below. Because they are not supposed to modify the main method, it will be identical for each student.

// BaseCode/Solution.java
public class Solution {

    // DO NOT MODIFY
    public static void main(String[] args) {
        Solution solution = new Solution();  
        solution.run();
    }
    
    public void run() {
        // TODO: Implement your solution here.
    }
}

To prevent JPlag from reporting similarities in the main method (and other parts of the template), instruct it to ignore matches with the base code by passing the -bc <base-code-name> argument. In the example above, <base-code-name> is BaseCode.

Relation between root directories and submissions

The following diagram shows all the relations between root directories, submissions, and files:

  • Submissions in new root directories are checked amongst themselves and against submissions from all other root directories.
  • Submissions in old root directories are only checked against submissions from new root directories, never against each other.
classDiagram
    direction TB

    Input -->"1..*" RootDirectory : consists of
    RootDirectory
    RootDirectory <|-- NewDirectory: is a
    RootDirectory <|-- OldDirectory : is a
    
    
    RootDirectory --> "1..*" Submission : contains
    Directory --> "1..*" File : contains
    Submission <|-- File : is a
    Submission <|-- Directory : is a
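
The two rules for new and old root directories determine which pairs of submissions are compared. The following sketch enumerates these pairs. The class ComparisonPairs and the method pairs are hypothetical names, and submissions are reduced to plain strings; it is an illustration of the rules, not JPlag's comparison code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the new/old comparison rules; not part of the JPlag API. */
final class ComparisonPairs {

    /**
     * Lists every pair that is compared: each new submission against every other
     * new submission (once per unordered pair) and against every old submission.
     * Two old submissions are never paired with each other.
     */
    static List<String> pairs(List<String> newSubmissions, List<String> oldSubmissions) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < newSubmissions.size(); i++) {
            String current = newSubmissions.get(i);
            // New against new: start at i + 1 to avoid duplicates and self-comparison.
            for (int j = i + 1; j < newSubmissions.size(); j++) {
                result.add(current + " <-> " + newSubmissions.get(j));
            }
            // New against old.
            for (String old : oldSubmissions) {
                result.add(current + " <-> " + old);
            }
        }
        return result;
    }
}
```

For n new and m old submissions, the sketch produces n(n-1)/2 + n·m pairs, so adding old submissions grows the workload linearly in m rather than quadratically.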