Are the dimensions correct for the Jaccard-based datasets? #607

I noticed that datasets using Jaccard similarity were recently added, namely MovieLens and Kosarak. I downloaded the Kosarak HDF5 file and extracted its 'train' split to a `.npy` file. However, the shape I get doesn't look right. Am I missing something?

```python
>>> import numpy as np
>>> kosarak = np.load("kosarak-jaccard.train.npy")
>>> kosarak.shape
(4167103,)
```

This is very different from what the README says (74962 vectors, each of dimensionality 27983). I would appreciate clarification on this.
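In case it helps frame the question: one guess on my part is that the 1-D array is a flat concatenation of variable-length item sets, so 4167103 would be the total number of items across all sets, not a vector count. I have not confirmed this. Below is a minimal sketch of how such a layout could be unpacked, assuming a companion array of per-set lengths exists. The names `flat`, `sizes`, and `split_sets` are hypothetical, and the data is a toy example, not taken from the HDF5 file.

```python
import numpy as np

def split_sets(flat, sizes):
    """Split a flat 1-D array into consecutive chunks of the given lengths.

    Hypothetical helper: assumes `sizes[i]` is the number of items in set i.
    """
    flat = np.asarray(flat)
    sizes = np.asarray(sizes)
    if sizes.sum() != flat.shape[0]:
        raise ValueError("sizes must sum to the length of the flat array")
    # np.split expects interior cut points, so drop the final cumulative offset.
    offsets = np.cumsum(sizes)[:-1]
    return np.split(flat, offsets)

# Toy data (hypothetical): three sets with 2, 3, and 1 items.
flat = np.array([5, 9, 1, 4, 7, 3])
sizes = np.array([2, 3, 1])
sets = split_sets(flat, sizes)
print(len(sets), [s.tolist() for s in sets])
```

If the real file is laid out this way, the number of recovered sets (rather than the length of the flat array) would be the figure to compare against the README.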



