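[Notes] Installing and configuring pyopencl with the open-source AMD graphics driver on a Lenovo Xiaoxin Chao 7000-15ARR running Manjaro Linux

Personal notes; if you repost, please credit the source.
by realasking


The open-source amdgpu driver is generally regarded as giving better OpenGL hardware-acceleration performance than the closed-source amdgpu-pro driver, but its Mesa-based OpenCL implementation only supports up to version 1.1 and does not always work properly. Installing the OpenCL support from the closed-source driver on top of the open-source graphics driver would clearly be the best of both. Fortunately, the AUR provides exactly this option.

Installation

yaourt -S opencl-amd ocl-icd opencl-headers clinfo pyopencl-headers python-pyopencl

Verification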


$ clinfo
Number of platforms                               2
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (2906.7)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (2906.7)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx804
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2 AMD-APP (2906.7)
  Driver Version                                  2906.7
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Board Name (AMD)                         Radeon 500 Series
  Device Topology (AMD)                           PCI-E, 01:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               8
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1124MHz
  Graphics IP (AMD)                               8.0
  Device Partition                                (core)
    Max number of sub-devices                     8
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple              64
  Wavefront width (AMD)                           64
  Preferred / native vector sizes
    char                                          4 / 4
    short                                         2 / 2
    int                                           1 / 1
    long                                          1 / 1
    half                                          1 / 1 (cl_khr_fp16)
    float                                         1 / 1
    double                                        1 / 1 (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2141634560 (1.995GiB)
  Global free memory (AMD)                        2071608 (1.976GiB)
  Global memory channels (AMD)                    2
  Global memory banks per channel (AMD)           16
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           1596849766 (1.487GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       2048 bits (256 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Local memory syze per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        1596849766 (1.487GiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        1580776646713521172ns (Tue Feb 4 08:37:26 2020)
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  Yes
    Number of async queues (AMD)                  2
    Max real-time compute queues (AMD)            0
    Max real-time compute units (AMD)             0
    SPIR versions                                 1.2
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_amd_bus_addressable_memory cl_khr_spir cl_khr_gl_event

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx902
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 AMD-APP (2906.7)
  Driver Version                                  2906.7 (PAL,HSAIL)
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     GPU
  Device Board Name (AMD)                         Unknown AMD GPU
  Device Topology (AMD)                           PCI-E, 05:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               8
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1100MHz
  Graphics IP (AMD)                               9.2
  Device Partition                                (core)
    Max number of sub-devices                     8
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple              64
  Wavefront width (AMD)                           64
  Preferred / native vector sizes
    char                                          4 / 4
    short                                         2 / 2
    int                                           1 / 1
    long                                          1 / 1
    half                                          1 / 1 (cl_khr_fp16)
    float                                         1 / 1
    double                                        1 / 1 (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2684354560 (2.5GiB)
  Global free memory (AMD)                        2553760 (2.435GiB)
  Global memory channels (AMD)                    4
  Global memory banks per channel (AMD)           4
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           912680550 (870.4MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       2048 bits (256 bytes)
  Preferred alignment for atomics
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    821412352 (783.4MiB)
  Preferred total size of global vars             2684354560 (2.5GiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    16
  Max pipe packet size                            912680550 (870.4MiB)
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Local memory syze per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        912680550 (870.4MiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                262144 (256KiB)
    Max size                                      8388608 (8MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Number of P2P devices (AMD)                     0
  P2P devices (AMD)                               (n/a)
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        1580776646713519545ns (Tue Feb 4 08:37:26 2020)
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  Yes
    Number of async queues (AMD)                  4
    Max real-time compute queues (AMD)            0
    Max real-time compute units (AMD)             0
    SPIR versions                                 1.2
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  AMD Accelerated Parallel Processing
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [AMD]
  clCreateContext(NULL, ...) [default]            Success [AMD]
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx804
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx804
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx804

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.12
  ICD loader Profile                              OpenCL 2.2

Switching GPUs

Set the environment variable PYOPENCL_CTX='0' or PYOPENCL_CTX='1' to choose which of the two GPUs pyopencl uses.
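
As a minimal sketch of this switch, the snippet below sets PYOPENCL_CTX from Python before a context is created. The helper names are hypothetical; `pyopencl.create_some_context` is the PyOpenCL call that reads this variable. The mapping of index 0 to gfx804 and index 1 to gfx902 is an assumption based on the order of the clinfo listing, and the import is guarded so the snippet still loads on a machine without PyOpenCL.

```python
import os


def select_opencl_platform(index):
    """Point PyOpenCL at one platform by setting PYOPENCL_CTX.

    `index` is the platform number in the order clinfo lists them
    (assumed here: 0 -> gfx804, 1 -> gfx902).
    """
    if index < 0:
        raise ValueError("platform index must be non-negative")
    os.environ["PYOPENCL_CTX"] = str(index)
    return os.environ["PYOPENCL_CTX"]


def describe_selected_devices():
    """Return device names of the chosen context, or None without PyOpenCL."""
    try:
        import pyopencl as cl
    except ImportError:
        return None
    # create_some_context() reads PYOPENCL_CTX; interactive=False avoids a prompt.
    ctx = cl.create_some_context(interactive=False)
    return [dev.name for dev in ctx.devices]
```

Calling `select_opencl_platform(1)` followed by `describe_selected_devices()` should then report the second GPU. The same effect can be had from the shell by prefixing the command with `PYOPENCL_CTX='1'`.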

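Setting up a systemd-managed Anki 2.1 sync server on CentOS in 2019

Personal notes; if you repost, please credit the source.

by realasking

I recently received a notice from AnkiWeb saying that my account had been deleted because I had not logged in for a long time. I do need flashcards, but I have been too busy lately to use them regularly, and AnkiWeb's policy leaves me little choice. So I decided to set up my own Anki sync server.

tsudoko on GitHub maintains a project [1] that provides a sync service for Anki 2.1. On the Arch Linux AUR, s7hoang has packaged it properly [2] so that the sync server can be started and managed with systemd. CentOS differs from Arch Linux in some respects, however, so s7hoang's approach needs some adjustment before it works on a CentOS server.

With small adjustments to s7hoang's approach, I installed anki-sync-server on a CentOS 7 server as follows.

1. Prepare the packages

a. Download s7hoang's AUR package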

$ wget https://aur.archlinux.org/cgit/aur.git/snapshot/anki-sync-server-git.tar.gz
$ tar -xzvf anki-sync-server-git.tar.gz
$ cd anki-sync-server-git

b. Prepare the server package

  • Edit the anki-sync-server.service file provided by s7hoang so that it reads:
[Unit]
Description=A sync server for anki
After=network.target

[Service]
Type=simple
User=anki-sync-server
Group=anki-sync-server
WorkingDirectory=/opt/anki-sync-server
ExecStart=/usr/bin/python36 -m ankisyncd
StartLimitIntervalSec=1
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
  • Following s7hoang's method, create a script named genpack.sh with the following content:
#!/bin/bash 
git clone https://github.com/tsudoko/anki-sync-server
cd anki-sync-server
mkdir -p "plugins/anki2.0"
mkdir -p "plugins/anki2.1/ankisyncd"
mkdir -p "plugins/systemd"
cp ../anki-sync-server.py "plugins/anki2.0"
cp ../__init__.py "plugins/anki2.1/ankisyncd"
cp ../anki-sync-server.service "plugins/systemd"
sed -i "2s/0\.0\.0\.0/$(ip route get 1.2.3.4 | awk '{print $7}')/" \
  plugins/anki2.0/anki-sync-server.py
sed -i "3s/0\.0\.0\.0/$(ip route get 1.2.3.4 | awk '{print $7}')/" \
  plugins/anki2.0/anki-sync-server.py
sed -i "3s/0\.0\.0\.0/$(ip route get 1.2.3.4 | awk '{print $7}')/" \
  plugins/anki2.1/ankisyncd/__init__.py
sed -i "s/python/python36/g" ankisyncctl.py

git submodule update --init

Then run this script.
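
The `sed` lines in genpack.sh replace 0.0.0.0 with the seventh whitespace-separated field of `ip route get 1.2.3.4`, which is the local source address when the route goes through a gateway. As a hedged sketch of that extraction, the Python function below looks for the token after `src` instead of relying on a fixed field position. The function name and the sample output line are illustrative assumptions, not part of the original script.

```python
def source_ip_from_route(output):
    """Return the address following 'src' in `ip route get` output, or None."""
    tokens = output.split()
    for i, token in enumerate(tokens[:-1]):
        if token == "src":
            return tokens[i + 1]
    return None


# Illustrative output line (addresses are made up for this example).
sample = "1.2.3.4 via 192.168.1.1 dev eth0 src 192.168.1.100 uid 1000"

print(source_ip_from_route(sample))
# Index 6 is the same field that awk '{print $7}' selects.
print(sample.split()[6])
```

Searching for `src` keeps working when the route has no `via` hop, where the address would no longer be the seventh field.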

  • Create the nginx configuration file anki.conf with the following content:
server {
        listen 80;
        server_name anki.myserver.tk;
        return 301 https://$server_name$request_uri;
}

server {
    # Allow access via HTTPS
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    # Set server names for access
    server_name anki.myserver.tk;

    # Set TLS certificates to use for HTTPS access
    ssl_certificate /etc/letsencrypt/live/myserver.tk/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/myserver.tk/privkey.pem;

    location / {
        # Prevent nginx from rejecting larger media files
        client_max_body_size 0;

        proxy_pass http://127.0.0.1:27701;
    }
}

2. Deploy the server

  • Log in to the server as root and copy the anki-sync-server folder to /opt

  • Copy the anki-sync-server.service file to /etc/systemd/system

  • Copy the anki.conf file to /etc/nginx/conf.d

  • Create the anki user

# useradd -d /opt/anki-sync-server -r -s /sbin/nologin anki-sync-server
# chown -R anki-sync-server /opt/anki-sync-server
# chgrp -R anki-sync-server /opt/anki-sync-server
# sudo -u anki-sync-server ./ankisyncctl.py adduser
  • Start the services
# systemctl enable anki-sync-server
# systemctl start anki-sync-server
# systemctl restart nginx

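3. Dependencies

Resolving dependencies is not covered here, because [1] already explains it clearly; please refer to the original author's description.

References

[1] tsudoko's Anki 2.1 sync server project

[2] s7hoang's packaging for starting tsudoko's server with systemd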

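How to delete all NoteExpress citations at once in Word 2016

Original work; if you repost, please credit the source.

by realasking

A submission of mine was rejected, so I had to switch journals. When I used NoteExpress (NE) to reformat the references, I found that some journal names were abbreviated correctly, some were abbreviated wrongly, and some were not abbreviated at all. I made a copy of NE's built-in Reference Style (RS) for the target journal and tried to modify it against the CSL, without success. I also tried each of the built-in abbreviation formats, and none of them was correct. Out of options, I decided to switch to Mendeley.

Switching to Mendeley requires first deleting the citations that NE has already inserted. NE adds citations as fields. Many people online suggest searching for all fields and deleting them, but that causes collateral damage: it also affects equations, crystal directions, corresponding-author information and other content that has already been entered. I finally solved the problem with the help of Microsoft's documentation, as follows:

1. Show field codes in the body text

On a Lenovo laptop, press Alt+Fn+F9

On other laptops, press Alt+F9

2. Open the search dialog

Ctrl+H

3. Search for the specific field and replace it with nothing

Click Replace

Click Special and choose Field from the pop-up menu; the Find what box now shows:

^d

After it, type " ADDIN NE.Ref." (with the leading space)

That is: ^d ADDIN NE.Ref.

Leave the Replace with box empty

Click Replace All

4. Close the dialog

5. Hide field codes

Same as step 1.

You will then see that everything NE inserted has been removed with little effort.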
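
The search spares equations and other fields because `^d ADDIN NE.Ref.` only matches fields whose code begins with that prefix. The toy Python sketch below illustrates the same filtering idea on made-up field codes. The list and the helper name are illustrative assumptions, and the snippet does not read or modify a real Word file.

```python
NE_PREFIX = " ADDIN NE.Ref."


def drop_noteexpress_fields(field_codes):
    """Keep every field code except those starting with the NoteExpress prefix."""
    return [code for code in field_codes if not code.startswith(NE_PREFIX)]


# Hypothetical field codes, as they might look with field codes shown.
fields = [
    " ADDIN NE.Ref.{0001}",
    " EQ \\f(1,2)",
    " ADDIN NE.Ref.{0002}",
    " REF _Ref123 \\h",
]

print(drop_noteexpress_fields(fields))
```

Only the two entries that start with the NE prefix are dropped; the equation and cross-reference fields survive, which is the behaviour the Replace All step relies on.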

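[Personal notes] Two ways to translate an entire PDF paper

Personal notes; if you repost, please credit the source.

by realasking

I have been writing recently and need to translate a number of papers for a first round of screening. Given my level of English, doing it all by hand is too slow, so I looked for ways to translate automatically.

I tried many approaches and ran into a number of problems: (1) Google Translate works only intermittently in mainland China and translates only part of the document; (2) Baidu Translate requires pasting the text in manually, one paragraph at a time; (3) converting the PDF to HTML and then calling Baidu or Google Translate through a browser extension leaves some paragraphs translated and others still in English; (4) with the Xunjie PDF online translator, uploading the PDF directly produces output in which the translation and the original overlap and not all of the original is translated, while uploading the converted Word document yields a translation of only the first 4 or 5 pages; (5) the various Python scripts found online all have version-compatibility problems, some work only on a particular operating system, they are troublesome to fix, and some still translate only part of the paper once fixed; (6) some offline translation programs can no longer be installed or no longer work after installation, and I could not find newer versions.

In the end I concluded that an online translation service is required, and that only Word files should be uploaded, because translating PDF files directly causes many problems. The procedure has two steps:

1. Convert the PDF to Word

Opening the PDF file with Word from the Office 2016 suite preinstalled on the Lenovo Xiaoxin Chao 7000 performs the conversion; then save it as a Word document. No other tool is needed.

2. Translate the Word document into Chinese

There are two ways to do this:

The first is to install the Babylon client, then right-click the file and choose Translate from the context menu.

This is very convenient, but the translation is very rough; essentially only the abstract is barely readable.

The Babylon client can be downloaded from:

https://www.babylon-software.com/

The second is to visit the Caiyun Technology (彩云科技) website, log in by scanning the QR code with WeChat, and click Online Translation.

To the left of Online Translation there is an Upload Document button; click it to upload the Word document, then click Translate.

When the translation finishes, do not click Download. Instead, click Personal Center and, under My Documents, click Word Download. This gives you a complete, mostly readable translation of the paper.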