Python Concurrency Programming - (7) GPGPU (PyCUDA)

[Hama] Lee Seung-hyun (wowlsh93@gmail.com) 2017. 4. 27. 16:08

Series order

1. threading
2. Condition & Semaphore
3. Queue
4. multiprocessing
5. Asynchronous (gevent)
6. Distributed (celery)
7. GPGPU (PyCUDA)
8. Coroutines, asyncio, async/await
9. concurrent.futures

PyCUDA

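Installation (installing PyCUDA on Ubuntu 14.04)

1. Check that CUDA is installed correctly.

I installed CUDA 7.5 on Ubuntu 14.04.

For details on installing CUDA on Ubuntu, see here.

2. gcc 4.8.4

On Ubuntu 14.04 the default gcc version is 4.8, so there is no need to upgrade it separately. The compatibility list is as follows: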

As of the CUDA 4.1 release, gcc 4.5 is now supported. gcc 4.6 and 4.7 are unsupported.
As of the CUDA 5.0 release, gcc 4.6 is now supported. gcc 4.7 is unsupported.
As of the CUDA 6.0 release, gcc 4.7 is now supported.
As of the CUDA 7.0 release, gcc 4.8 is fully supported, with 4.9 support on Ubuntu 14.04.
As of the CUDA 7.5 release, gcc 4.8 is fully supported, with 4.9 support on Ubuntu 14.04.
As of the CUDA 8 release, gcc 5.3 is fully supported on Ubuntu 16.04.
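The list above can be turned into a quick sanity check. The sketch below is only an illustration: `NEWEST_GCC`, `parse_gcc_version`, and `gcc_ok` are names made up for this post, the values are copied from the list (taking the newest gcc mentioned for each release), and it only checks that gcc is not too new. It says nothing about CUDA releases or platforms the list does not cover.

```python
import re

# Newest gcc (major, minor) mentioned for each CUDA release in the list above.
# Hypothetical helper table; releases missing from the list are not covered.
NEWEST_GCC = {
    "4.1": (4, 5),
    "5.0": (4, 6),
    "6.0": (4, 7),
    "7.0": (4, 9),  # 4.9 is only mentioned for Ubuntu 14.04
    "7.5": (4, 9),  # 4.9 is only mentioned for Ubuntu 14.04
    "8": (5, 3),
}


def parse_gcc_version(text):
    """Extract (major, minor) from a version string such as '4.8.4'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return int(match.group(1)), int(match.group(2))


def gcc_ok(cuda_release, gcc_version):
    """Return True if gcc_version is not newer than the newest gcc listed."""
    return parse_gcc_version(gcc_version) <= NEWEST_GCC[cuda_release]


print(gcc_ok("7.5", "4.8.4"))  # True: the setup used in this post
```

The version string can be taken from the output of `gcc --version`.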

3. Install the Boost C++ libraries

$ sudo apt-get install libboost-all-dev

Boost.Python has supported both Python 2.x and 3.x since 2009.

Personally, the command above caused problems for me, so I built and installed Boost by hand:

wget -O boost_1_55_0.tar.gz http://sourceforge.net/projects/boost/files/boost/1.55.0/boost_1_55_0.tar.gz/download
tar xzvf boost_1_55_0.tar.gz
cd boost_1_55_0/

Boost's bootstrap setup:

./bootstrap.sh --prefix=/usr/local

Then build it with:

./b2

and eventually install it:

sudo ./b2 install

4. Install numpy

$ sudo apt-get install python-numpy -y

or

$ sudo pip install numpy

5. Download and install PyCUDA

Install the required libraries:

$ sudo apt-get install build-essential python-dev python-setuptools libboost-python-dev libboost-thread-dev -y

Download PyCUDA and unpack it (version 2016.1.2):

$ tar xzvf pycuda-2016.1.2.tar.gz

Configure and build PyCUDA:

$ cd pycuda-2016.1.2

# For Python 2.x

$ ./configure.py --cuda-root=/usr/local/cuda-7.5 --cudadrv-lib-dir=/usr/lib/x86_64-linux-gnu --boost-inc-dir=/usr/include --boost-lib-dir=/usr/lib --boost-python-libname=boost_python --boost-thread-libname=boost_thread --no-use-shipped-boost

$ make -j 4
$ sudo python setup.py install
$ sudo pip install .

Things to check:

1. If you are not installing PyCUDA 0.93 or later, remove --boost-thread-libname=boost_thread-mt.

2. If CUDA is not installed in /usr/local/cuda-7.5, change --cuda-root= accordingly.

3. If you get an error saying that -lboost_python-mt or -lboost_thread-mt cannot be found, but you know those files are in /usr/lib or elsewhere, make sure that symbolic links to them exist. For example, on Ubuntu 9.10 you may have to create a link named libboost_python-mt that points to libboost_python-mt-py26 (or change the option to --boost-python-libname=boost_python-mt-py26):

$ sudo ln -s /usr/lib/libboost_python-mt-py26 /usr/lib/libboost_python-mt

4. On Ubuntu releases that use the 'nvidia-current' package (e.g. Ubuntu 10.04 Lucid Lynx), you need to add the following link:

$ ln -sf /usr/lib/nvidia-current/libcuda.so /usr/lib/libcuda.so

5. On Ubuntu releases that use the 'cuda3.1' package (Ubuntu 10.04 Lucid Lynx), you need to add the following link:

$ sudo ln -s /usr/lib/libboost_python-mt-py26.so /usr/lib/libboost_python-mt.so

6. If running sudo python setup.py install fails with the error "*** CUDA_ROOT not set, and nvcc not in path. Giving up.", sudo may be using a different PATH value. Try the following instead:

$ sudo env PATH=$PATH python setup.py install

7. Do not forget to include the CUDA library directories '${CUDA_ROOT}/lib' and '${CUDA_ROOT}/lib64' in the LD_LIBRARY_PATH variable.
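Items 6 and 7 both come down to environment variables, so they can be checked from Python before building. This is a rough sketch, not part of PyCUDA: the function names `find_nvcc` and `missing_cuda_lib_dirs` are hypothetical, and the fallback `/usr/local/cuda-7.5` is simply the install location assumed in this post.

```python
import os


def find_nvcc(path_value):
    """Return the first directory in a PATH-style string that contains nvcc."""
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, "nvcc")):
            return directory
    return None


def missing_cuda_lib_dirs(cuda_root, ld_library_path):
    """Return the CUDA lib directories that are absent from LD_LIBRARY_PATH."""
    present = set(
        os.path.normpath(p) for p in ld_library_path.split(os.pathsep) if p
    )
    wanted = [os.path.join(cuda_root, "lib"), os.path.join(cuda_root, "lib64")]
    return [d for d in wanted if os.path.normpath(d) not in present]


# Assumption: CUDA_ROOT falls back to the install location used in this post.
cuda_root = os.environ.get("CUDA_ROOT", "/usr/local/cuda-7.5")
print("nvcc found in: %s" % find_nvcc(os.environ.get("PATH", "")))
print("missing from LD_LIBRARY_PATH: %s"
      % missing_cuda_lib_dirs(cuda_root, os.environ.get("LD_LIBRARY_PATH", "")))
```

Running the same script once as a normal user and once under sudo shows whether the two environments really differ.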

6. Getting started with PyCUDA

First, check that the imports work.

import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule

You do not have to use pycuda.autoinit. Initialization, context creation, and resource cleanup can also be done manually if you prefer.
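As a minimal sketch of the manual route, assuming a single GPU at device index 0: the helper name `first_device_name` is made up for this post, and the `try`/`except` around the import is only there so the snippet degrades gracefully on a machine without PyCUDA.

```python
try:
    import pycuda.driver as cuda
except ImportError:  # PyCUDA (or the CUDA driver library) is not available
    cuda = None


def first_device_name():
    """Create and release a context by hand; return the device name.

    Returns None when PyCUDA could not be imported.
    """
    if cuda is None:
        return None
    cuda.init()                       # must come before any other driver call
    device = cuda.Device(0)           # assumption: use the first CUDA device
    context = device.make_context()   # create a context and make it current
    try:
        return device.name()
    finally:
        context.pop()                 # release the context when done


print(first_device_name())
```

With pycuda.autoinit, the equivalent of these steps runs at import time and the cleanup is registered to run at interpreter exit.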

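Transferring data

The next step is to transfer data to the device. In PyCUDA, data is mostly transferred from numpy arrays on the host. (In fact, anything that satisfies the Python buffer interface will work, even a str.) Let's create a 4x4 array of random numbers.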

import numpy
a = numpy.random.randn(4,4)

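a consists of double-precision numbers, but many NVIDIA devices (older ones in particular) support only single precision or are much slower in double precision, so we convert it: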

a = a.astype(numpy.float32)

Next, we need somewhere to transfer the data to, so we allocate memory on the device.

a_gpu = cuda.mem_alloc(a.nbytes)
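The size passed to mem_alloc depends on the dtype, which is easy to confirm with numpy alone. The snippet below is a small illustration that needs no GPU; the variable name `a32` is introduced here only to keep both arrays around for comparison.

```python
import numpy

a = numpy.random.randn(4, 4)    # numpy produces float64 by default
a32 = a.astype(numpy.float32)   # single-precision copy, as sent to the GPU

# 16 elements * 8 bytes versus 16 elements * 4 bytes
print("%s %d" % (a.dtype, a.nbytes))
print("%s %d" % (a32.dtype, a32.nbytes))
```

So for the 4x4 float32 array used here, the device allocation is 64 bytes.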

As a last step, we transfer the data to the GPU.

cuda.memcpy_htod(a_gpu, a)

Executing a kernel

For this tutorial we will stick to something simple: code that doubles each entry of a_gpu. To do this, we write the corresponding CUDA C code and pass it to the constructor of pycuda.compiler.SourceModule:

mod = SourceModule("""
  __global__ void doublify(float *a)
  {
    int idx = threadIdx.x + threadIdx.y*4;
    a[idx] *= 2;
  }
  """)

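If there are no errors, the code is now compiled and loaded onto the device. We look up a reference to our pycuda.driver.Function and call it, passing a_gpu as the argument and specifying a block size of 4x4.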

func = mod.get_function("doublify")
func(a_gpu, block=(4,4,1))
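To see why a 4x4 block touches every element exactly once, the kernel's index computation can be replayed on the CPU. This is a plain-Python emulation for illustration only; `flat_index` and `BLOCK` are hypothetical names, and the width of 4 is the hard-coded constant from the kernel.

```python
BLOCK = (4, 4, 1)  # same block shape as in the kernel launch


def flat_index(thread_x, thread_y, width=4):
    """Mirror of the kernel's `threadIdx.x + threadIdx.y*4` computation."""
    return thread_x + thread_y * width


# One entry per thread in the block, iterating rows (y) then columns (x).
indices = [flat_index(x, y) for y in range(BLOCK[1]) for x in range(BLOCK[0])]
print(indices)
```

Each of the 16 threads gets a distinct offset from 0 to 15, which matches the row-major layout of the 4x4 float32 array: threadIdx.y selects the row and threadIdx.x the column. A different array shape would need a different width in the kernel.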

Finally, we fetch the data back from the GPU and display it together with the original a.

a_doubled = numpy.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)
print(a_doubled)
print(a)

The output looks like this (the doubled array first, then the original):

[[ 0.51360393  1.40589952  2.25009012  3.02563429]
 [-0.75841576 -1.18757617  2.72269917  3.12156057]
 [ 0.28826082 -2.92448163  1.21624792  2.86353827]
 [ 1.57651746  0.63500965  2.21570683 -0.44537592]]
[[ 0.25680196  0.70294976  1.12504506  1.51281714]
 [-0.37920788 -0.59378809  1.36134958  1.56078029]
 [ 0.14413041 -1.46224082  0.60812396  1.43176913]
 [ 0.78825873  0.31750482  1.10785341 -0.22268796]]
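The printed values can be checked without a GPU. The sketch below copies the two matrices from the output above and confirms that each entry of the first is twice the matching entry of the second. The tolerance of 1e-6 is my own choice, needed because the printed float32 values are rounded to eight decimals.

```python
# Values copied from the printed output above.
a_doubled = [
    [0.51360393, 1.40589952, 2.25009012, 3.02563429],
    [-0.75841576, -1.18757617, 2.72269917, 3.12156057],
    [0.28826082, -2.92448163, 1.21624792, 2.86353827],
    [1.57651746, 0.63500965, 2.21570683, -0.44537592],
]
a = [
    [0.25680196, 0.70294976, 1.12504506, 1.51281714],
    [-0.37920788, -0.59378809, 1.36134958, 1.56078029],
    [0.14413041, -1.46224082, 0.60812396, 1.43176913],
    [0.78825873, 0.31750482, 1.10785341, -0.22268796],
]

# Compare element by element, allowing for rounding in the printed digits.
all_doubled = all(
    abs(doubled - 2 * original) < 1e-6
    for doubled_row, original_row in zip(a_doubled, a)
    for doubled, original in zip(doubled_row, original_row)
)
print(all_doubled)  # True
```

With the real arrays in memory, the same check is usually written with numpy.allclose(a_doubled, 2 * a).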

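7. Wrapping up

To go further from here, the official documentation should be enough: PyCUDA official documentation.
For studying CUDA itself, I recommend the book 예제로 배우는 CUDA 프로그래밍 ("CUDA Programming Learned by Example").

References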

https://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu
https://documen.tician.de/pycuda/tutorial.html

Attribution-NonCommercial-ShareAlike
