Sparse matrix loading error: "SparseArrays.SparseMatrixCSC{Core.Int8,Core.Int64} does not match currently loaded type" #282

@emsal0

We (@grig-guz and I) are trying to store a large CSV dataset in a sparse matrix and then save that matrix to a JLD file. However, loading the file again fails with a type error: ERROR: stored type SparseArrays.SparseMatrixCSC{Core.Int8,Core.Int64} does not match currently loaded type

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.3 (2020-11-09)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> using JLD

julia> load("data_matrices.jld")
┌ Warning: type SparseArrays.SparseMatrixCSC{Core.Int8,Core.Int64} not present in workspace; reconstructing
└ @ JLD ~/.julia/packages/JLD/jeGJb/src/jld_types.jl:697
ERROR: stored type SparseArrays.SparseMatrixCSC{Core.Int8,Core.Int64} does not match currently loaded type
Stacktrace:
 [1] jldatatype(::JLD.JldFile, ::HDF5.HDF5Datatype) at /home/em/.julia/packages/JLD/jeGJb/src/jld_types.jl:723
 [2] read(::JLD.JldDataset) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:367
 [3] read(::JLD.JldFile, ::String) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:343
 [4] #43 at ./array.jl:0 [inlined]
 [5] iterate at ./generator.jl:47 [inlined]
 [6] collect(::Base.Generator{Array{String,1},JLD.var"#43#45"{JLD.JldFile}}) at ./array.jl:686
 [7] (::JLD.var"#42#44")(::JLD.JldFile) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:1246
 [8] jldopen(::JLD.var"#42#44", ::String, ::Vararg{String,N} where N; kws::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:243
 [9] jldopen(::Function, ::String, ::String) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:241
 [10] load(::FileIO.File{FileIO.DataFormat{:JLD}}) at /home/em/.julia/packages/JLD/jeGJb/src/JLD.jl:1245
 [11] load(::String; options::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/em/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133
 [12] load(::String) at /home/em/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133
 [13] top-level scope at REPL[2]:1
 [14] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288

julia>

We build and save the matrices with the following script:

import Pkg; Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add("JLD")
using CSV
using DataFrames
using SparseArrays
using JLD

# Data is in the format Item-User-Rating-Timestamp
frame = CSV.read("../data/data.csv", header=["item", "user", "rating", "time_stamp"], DataFrame)

unique_items = unique(frame[!, "item"])
unique_users = unique(frame[!, "user"])

# String id to numerical id
items_map = Dict(zip(unique_items, range(1, length=length(unique_items))))
users_map = Dict(zip(unique_users, range(1, length=length(unique_users))))

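num_rows = nrow(frame)
# Coordinate triplets: row indices (items), column indices (users), values
I = zeros(Int64, num_rows)
J = zeros(Int64, num_rows)
V = ones(Int8, num_rows)  # indicator value for each rating row
V_ratings = convert(Array{Int8,1}, frame[!, "rating"])

for (index, row) in enumerate(eachrow(frame))
    I[index] = items_map[row["item"]]
    J[index] = users_map[row["user"]]
end

# Sigma marks which (item, user) pairs have a rating; B holds the ratings
Sigma = sparse(I, J, V)
B = sparse(I, J, V_ratings)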

save("../data/data_matrices.jld", "sigma", Sigma, "b", B)
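
One thing that may be relevant, offered as a guess rather than a confirmed diagnosis: the loading session above runs only `using JLD`, and the warning says the type is "not present in workspace", so `SparseArrays` may simply not be loaded when the file is read. A minimal sketch of the load step with `SparseArrays` imported first (the file name and the keys "sigma" and "b" are taken from the session and script above; whether this resolves the error is an assumption):

```julia
# Assumption: loading SparseArrays before reading lets JLD resolve
# SparseArrays.SparseMatrixCSC instead of trying to reconstruct it.
using SparseArrays
using JLD

# load returns a Dict mapping the saved names to their values
data = load("data_matrices.jld")
Sigma = data["sigma"]
B = data["b"]
```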
