Why must the SAP representation protein sequence file and the canonical compound SMILES file have the same number of lines?

Hi, 

I am trying to run your model on my own data.

(1) According to the file "DeepAffinity_inference.sh", it seems that the input protein sequence file and the compound file must have the same number of lines, as shown below.
![Screenshot, 2022-09-22 10-46-42](https://user-images.githubusercontent.com/22671339/191642157-f5b68ad7-8931-40a4-8dad-a03c605f3c11.png)
Does this mean that the number of entities in the two files has to match, or literally that the number of lines in the two files has to match?
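
To make question (1) concrete, here is a minimal sketch of the check I assume is intended: line i of the protein file pairs with line i of the compound file, so both files need the same number of non-empty lines. This is only my reading of the script, not something confirmed by the repository, and the function names and file paths are hypothetical.

```python
def count_lines(path):
    """Count the non-empty lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def check_pairing(protein_path, compound_path):
    """Return (n_protein, n_compound, matches) for the two input files.

    Assumption: each non-empty line is one entity, and line i of one file
    is paired with line i of the other.
    """
    n_protein = count_lines(protein_path)
    n_compound = count_lines(compound_path)
    return n_protein, n_compound, n_protein == n_compound
```

For example, `check_pairing("my_proteins.txt", "my_compounds.txt")` (hypothetical file names) would report both counts and whether they agree.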

(2) After following your manual, I generated two files for my own data.
Could you tell me whether the structure of the entities in each file is correct for model input?
  - CID_Smi_Feature:
![Screenshot, 2022-09-22 11-31-44](https://user-images.githubusercontent.com/22671339/191645315-2bf64916-ddf2-4f84-aad3-2bf7ec4109fc.png)
  - protein_grouped_finalPresentation:
![Screenshot, 2022-09-22 11-33-55](https://user-images.githubusercontent.com/22671339/191645554-211a3ee4-1c89-4481-9b31-6d2cd38273cd.png)

Thank you,
CallMeDek


