Unable to parse the Mainframe copybook which has a COBOL datatype of BBBB which means empty spaces #734

## Describe the bug

We are using Cobrix with PySpark, running on AWS EMR.
The EBCDIC file and its corresponding copybook are stored in an AWS S3 bucket. When we try to parse the EBCDIC file using the copybook, we get the error below.

Error message:
py4j.protocol.Py4JJavaError : An error occurred while calling o2021.load : za.co.absa.cobrix.cobol.parser.exceptions.SyntaxErrorException : Syntax error in the copybook at line 29 : Invalid input 'BBBB' at position 29:45

## Code snippet that caused the issue
```
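import logging

try:
    file_path = f's3://{s3_bucket}/{ebcdic_file_path}'
    # Parentheses let the chained reader calls span multiple lines.
    df = (
        spark.read
        .format("cobol")
        .option("copybook_contents", copybook)
        # `ebcdic` is assumed to be a variable holding the encoding name.
        .option("encoding", ebcdic)
        .option("schema_retention_policy", "collapse_root")
        .option("generate_record_id", True)
        .load(file_path)
    )
except Exception as e:
    log_message = f'spark job failed with error : {e}'
    logging.error(log_message)
    # Re-raise with the original traceback intact.
    raise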
```

## Expected behavior
We expected Cobrix to successfully parse the EBCDIC file using the copybook, including the field whose PICTURE clause ends in 'BBBB'.
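
For reference, our understanding is that `B` in a COBOL PICTURE clause is a simple insertion character that occupies one position and holds a space, so `PIC 9(06)BBBB` should describe a 10-character field: six digits followed by four blanks. The sketch below only illustrates that width calculation for simple DISPLAY pictures. The helper names `expand_picture` and `picture_width` are hypothetical and are not part of Cobrix, and the code ignores symbols such as `S`, `V`, and `P`.

```python
import re


def expand_picture(pic: str) -> str:
    """Expand repetition factors, e.g. '9(06)BBBB' -> '999999BBBB'."""
    return re.sub(
        r"(.)\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        pic,
    )


def picture_width(pic: str) -> int:
    """Count character positions of a simple DISPLAY picture.

    Assumes every symbol (9, X, B, ...) takes exactly one position,
    which does not hold for S, V, or P.
    """
    return len(expand_picture(pic))


print(expand_picture("9(06)BBBB"))  # 999999BBBB
print(picture_width("9(06)BBBB"))   # 10
```

With this reading, the redefining field would be the same as any other 10-character display field, which is why we expected the copybook to parse.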

## Context
PySpark Jar dependencies : 
- cobol-parser_2.12-2.6.7.jar
- hadoop-lzo-0.4.3.jar
- scodec-bits_2.12-1.1.12.jar
- scodec-core_2.12-1.11.4.jar
- spark-cobol_2.12-2.6.7.jar
- Operating system: AWS EMR (Linux Image)

## Copybook (if possible)
```
                    15 EL02-267-COLNAME-A
                      20 EL02-267-COLNAME-B
                                                       PIC X(19).
                      .........
                      .........
                      .........
                      20 EL02-267-COLNAME-C  REDEFINES
                                    EL02-267-COLNAME-D
                                                       PIC 9(06)BBBB. (This is what is causing the issue we suppose)
GP5WHB        20 FILLER                 pic X(285).                      CLEAN-UP
```

Attach a small data file that can help reproduce the issue, if possible: We need to check whether this is feasible given the confidentiality of the data, and will get back to you.

