
Different behavior between CPU and CUDA #163

@Boltzmachine

When the CUDA devices are not visible, my program runs fine.
But when the CUDA devices are available, it fails with this error:

Traceback (most recent call last):
  File "/home/boltzmachine/THG/train.py", line 10, in <module>
    from models.simple import Simple, Share
  File "/home/boltzmachine/THG/models/simple.py", line 5, in <module>
    from .GNN import DenseGatedRGCN
  File "/home/boltzmachine/THG/models/GNN.py", line 10, in <module>
    from torch_geometric.nn.inits import glorot, zeros
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/__init__.py", line 5, in <module>
    import torch_geometric.data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_geometric/data/data.py", line 8, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/boltzmachine/miniconda3/envs/THG/lib/python3.7/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
AttributeError: 'NoneType' object has no attribute 'origin'
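
For readers hitting the same traceback, here is a minimal sketch of how this `AttributeError` can arise. It assumes, based only on the fragment visible in the traceback, that `torch_sparse/__init__.py` looks up each compiled extension by the name `{library}_{suffix}` with `importlib.machinery.PathFinder().find_spec(...)` and reads `.origin` from the result. The helper name `locate_extension` and the empty `_spmm_cpu.py` stand-in file are hypothetical and used for illustration only.

```python
import importlib.machinery
import os
import tempfile


def locate_extension(library, suffix, directory):
    """Return the file path of `{library}_{suffix}` in `directory`, or None."""
    spec = importlib.machinery.PathFinder().find_spec(
        f"{library}_{suffix}", [directory])
    # find_spec returns None when no matching module file exists;
    # reading `.origin` on that None is what raises the AttributeError.
    return None if spec is None else spec.origin


def demo():
    with tempfile.TemporaryDirectory() as directory:
        # Hypothetical stand-in for an install that ships only a CPU build.
        open(os.path.join(directory, "_spmm_cpu.py"), "w").close()
        cpu = locate_extension("_spmm", "cpu", directory)
        cuda = locate_extension("_spmm", "cuda", directory)
        return cpu, cuda


cpu_path, cuda_path = demo()
print("cpu variant found:", cpu_path is not None)
print("cuda variant found:", cuda_path is not None)
```

If the suffix is chosen from whether CUDA is available at import time (an assumption, not shown in the traceback), this would be consistent with the import working when the devices are hidden and failing when they are visible.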

I use torch 1.7.0 with cu101.
I uninstalled torch-sparse repeatedly until nothing was left installed,
and then I installed it with:

pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html --no-cache-dir
pip --no-cache-dir install torch-geometric

It seems that there are no CUDA *.so files in ~/miniconda3/envs/THG/lib/python3.7/site-packages/torch/cuda/:

__init__.py      _diag_cpu.so        _metis_cpu.so    _saint_cpu.so   _spspmm_cpu.so   bandwidth.py  convert.py  index_select.py   metis.py   padding.py  rw.py      select.py  storage.py    utils.py
__pycache__      _ego_sample_cpu.so  _relabel_cpu.so  _sample_cpu.so  _version_cpu.so  cat.py        diag.py     masked_select.py  mul.py     permute.py  saint.py   spmm.py    tensor.py
_convert_cpu.so  _hgt_sample_cpu.so  _rw_cpu.so       _spmm_cpu.so    add.py           coalesce.py   eye.py      matmul.py         narrow.py  reduce.py   sample.py  spspmm.py  transpose.py

One possible reason is that I am using a computer cluster. In my local environment, CUDA is not available until I submit a job through Slurm.
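
To check this on both the local environment and inside a Slurm job, a small diagnostic along these lines might help. It is only a sketch: it assumes the compiled files follow the `_<name>_cpu.so` / `_<name>_cuda.so` naming seen in the listing above, and `extension_variants` and `report` are hypothetical helper names. It locates the `torch_sparse` directory with `importlib.util.find_spec` instead of importing the package, so it should not trigger the failing import.

```python
import importlib.util
import os
import re


def extension_variants(filenames):
    """Group compiled extension files by their `cpu` / `cuda` suffix."""
    variants = {"cpu": [], "cuda": []}
    for name in filenames:
        # Assumed naming scheme: _<library>_<cpu|cuda>.<so|pyd>
        match = re.fullmatch(r"(_\w+?)_(cpu|cuda)\.(so|pyd)", name)
        if match:
            variants[match.group(2)].append(match.group(1))
    return variants


def report():
    # Find the package directory without executing torch_sparse/__init__.py.
    spec = importlib.util.find_spec("torch_sparse")
    if spec is None or not spec.submodule_search_locations:
        print("torch_sparse is not installed in this environment")
        return
    directory = list(spec.submodule_search_locations)[0]
    variants = extension_variants(os.listdir(directory))
    print("package directory:", directory)
    print("cpu extensions:", len(variants["cpu"]))
    print("cuda extensions:", len(variants["cuda"]))
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch version:", torch.__version__)
    print("torch built for CUDA:", torch.version.cuda)
    print("CUDA available now:", torch.cuda.is_available())


if __name__ == "__main__":
    report()
```

Running it once locally and once inside a submitted job would show whether `torch.cuda.is_available()` differs between the two while the set of installed extension files stays CPU-only.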

Labels: bug
