TypeError: can't convert cuda:0 device type tensor to numpy

김다린 · July 22, 2024

Error messages

First error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

Second error:

TypeError                                 Traceback (most recent call last)
Cell In[25], line 2
...(omitted)
File /opt/conda/lib/python3.10/site-packages/torch/_tensor.py:1030, in Tensor.__array__(self, dtype)
1028     return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
1029 if dtype is None:
-> 1030     return self.numpy()
1031 else:
1032     return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Environment

Platform: Kaggle Notebook
GPU: P100

Cause

In my case, the errors appeared while I was trying to use the GPU to compute the TF-IDF matrix and the cosine similarity over a DataFrame for query search.

They showed up when I took code that had been running on the CPU and simply added

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

or changed only part of the code, leaving the rest on the CPU.

The first error, the RuntimeError, is raised when a single operation receives tensors that live on different devices (here, the CPU and the GPU).

To fix it, either keep the model and all embeddings on the CPU, or add .to(device) where query_embedding and text_embedding are computed so that everything is computed on one device.
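A minimal sketch of the mismatch and its fix. The tensors are random stand-ins for real embeddings, and the shapes are assumptions chosen only for illustration. On a machine without a GPU, device falls back to the CPU and the code still runs.

```python
import torch
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative embeddings (random stand-ins for real model outputs).
query_embedding = torch.randn(1, 8)  # created on the CPU by default
text_embedding = torch.randn(5, 8)   # created on the CPU by default

# Moving only one operand reproduces the RuntimeError when device is cuda:0:
#   F.cosine_similarity(query_embedding.to(device), text_embedding)

# Fix: put both operands on the same device before the operation.
query_embedding = query_embedding.to(device)
text_embedding = text_embedding.to(device)

# The single query row is broadcast against the five text rows.
scores = F.cosine_similarity(query_embedding, text_embedding, dim=1)
print(scores.shape, scores.device)
```

Note that .to(device) returns a new tensor rather than moving the original in place, so the result has to be assigned back.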

In my case, I had switched the cosine_similarity computation to the GPU but was still computing the TF-IDF vectors on the CPU, which is what triggered the errors.
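The second error, the TypeError, goes the other way: NumPy-based code receives a tensor that still lives on the GPU. A small sketch of the conversion the error message asks for, where to_numpy is a hypothetical helper name and the tensor is a random stand-in for a GPU result:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def to_numpy(t: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: detach from autograd, copy to host memory, convert."""
    return t.detach().cpu().numpy()


similarity = torch.rand(3, 4, device=device)  # stand-in for a result computed on the GPU

# np.asarray(similarity) would raise the TypeError when similarity is on cuda:0,
# because NumPy can only read host memory.
similarity_np = to_numpy(similarity)
print(type(similarity_np), similarity_np.shape)
```

Calling .cpu() on a tensor that is already on the CPU is a no-op, so the helper is safe to use on either device.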

Solution

Make sure the query embedding, the text embedding, and the TF-IDF vectors all end up on the same device.

# Move the model to the selected device (if there is a model)
model.to(device)

# Create the query embedding
query_embedding = encode_text(query).unsqueeze(0).to(device)

# Create the text embedding
text_embedding = encode_text(text).unsqueeze(0).to(device)

# Create the TF-IDF vectors
tfidf_vectorizer, tfidf_matrix = compute_tfidf(df)
query_vector = tfidf_vectorizer.transform([query])

# Convert the sparse TF-IDF vector to a dense NumPy array on the CPU (if needed)
query_vector = query_vector.toarray()  # usable as a NumPy array after this
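If the TF-IDF similarity itself should run on the GPU, the dense arrays can be turned into tensors on the same device. This is a rough sketch rather than the code from the notebook: it assumes scikit-learn's TfidfVectorizer sits behind compute_tfidf, and it uses a toy corpus in place of the DataFrame.

```python
import torch
import torch.nn.functional as F
from sklearn.feature_extraction.text import TfidfVectorizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy corpus standing in for the DataFrame's text column (assumption).
docs = ["gpu tensor error", "cosine similarity search", "tf-idf on the cpu"]
query = "gpu error"

vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(docs)  # SciPy sparse matrix, on the CPU
query_vector = vectorizer.transform([query])   # SciPy sparse matrix, on the CPU

# Densify, then build float32 tensors directly on the target device.
doc_t = torch.tensor(tfidf_matrix.toarray(), dtype=torch.float32, device=device)
query_t = torch.tensor(query_vector.toarray(), dtype=torch.float32, device=device)

# Both operands now share a device, so no RuntimeError is raised.
scores = F.cosine_similarity(query_t, doc_t, dim=1)  # one score per document
best = int(scores.argmax())  # int() copies the single index back to the host
print(docs[best])
```

Densifying with toarray() can use a lot of memory for a large vocabulary, so for big corpora it may be more practical to keep the TF-IDF part on the CPU and move only the final scores.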
