
DVA3221 loader development thread


On 04/05/2023 at 15:52, Orphée said:

Your passthrough may not work as intended.

 

On Proxmox you would follow these steps:

 

https://pve.proxmox.com/wiki/PCI_Passthrough#GPU_Passthrough

 

Is everything correctly blacklisted on the Proxmox host?

Did you configure your Proxmox GRUB kernel command line correctly?

Did you confirm that IOMMU is enabled and the GPU has its own IOMMU group?
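For reference, a minimal sketch of the host-side settings the Proxmox wiki walks through (Intel host assumed; the vfio-pci IDs below are the GTX 1650 values from this thread, adjust to your own card):

# /etc/default/grub - enable the IOMMU on the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# /etc/modules - load the VFIO modules at boot
vfio
vfio_iommu_type1
vfio_pci

# /etc/modprobe.d/blacklist.conf - keep the host drivers off the card
blacklist nouveau
blacklist nvidia
blacklist nvidiafb

# /etc/modprobe.d/vfio.conf - bind the GPU and its audio function to vfio-pci
options vfio-pci ids=10de:1f82,10de:10fa

# apply and reboot
update-grub
update-initramfs -u -k all

After the reboot, lspci -nnk on the host should show "Kernel driver in use: vfio-pci" for the card.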

 

On ESXi I can't tell; I don't have it anymore and never tried GPU passthrough there.

 

 

Edit :

 

Did you look at your device name?

Mine is:

 

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Device [103c:8558]
        Flags: bus master, fast devsel, latency 0, IRQ 33
        Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 7000000000 (64-bit, prefetchable) [size=256M]
        Memory at 7010000000 (64-bit, prefetchable) [size=32M]
        I/O ports at d000 [size=128]
        Expansion ROM at c1080000 [virtual] [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nvidia

 

Edit 2 :

 

Is your guest BIOS configured as UEFI?

 

Hi everyone,

 

Thanks to your comments I have switched to Proxmox and succeeded in getting my GPU passthrough working:

[screenshot attached]

 

Now I'm trying to get hardware acceleration working with Jellyfin, but I can't. I have tried the SynoCommunity package and the Docker installation (even the "/dev/dri/renderD128" trick), but neither way works. Do you have a short tutorial on how to get hardware acceleration working with Jellyfin?

 

Regards


/dev/dri only applies to Intel i915 (iGPU).
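For context: /dev/dri/renderD128 is the Intel iGPU (VAAPI) route. With an NVIDIA card passed through there is no nvidia-container-runtime on DSM, so people usually hand-map the NVIDIA device nodes into the container. A rough, untested sketch (official jellyfin/jellyfin image; the device paths are the standard nodes, check yours with ls /dev/nvidia*):

docker run -d --name jellyfin \
  --device /dev/nvidia0 \
  --device /dev/nvidiactl \
  --device /dev/nvidia-uvm \
  -v /volume1/docker/jellyfin/config:/config \
  -v /volume1/video:/media \
  -p 8096:8096 \
  jellyfin/jellyfin
# You will likely also need to bind-mount the host's NVIDIA user-space libraries
# (they come with the NVIDIARuntimeLibrary package), then enable NVENC/NVDEC in
# Jellyfin > Dashboard > Playback. No guarantee this works on the DVA3221 build.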

 

I remember there was a Chinese build of Jellyfin and FFmpeg that worked without too many tricks...

Try the search function.

 

And I remember you have to stick to 11.7 and not 11.8, something like that...

 

I still think you should consider Emby.

 

Edited by Orphée

Hello, I'm looking for some help with DVA3221 loading; anyone who can help is appreciated. @flyride, I saw you previously helped someone with another loader troubleshoot their satamap issues (theirs was Proxmox, and I believe you had to add some code to ignore the CD-ROM as a drive). I have a much simpler setup but continue to have issues, and I'm not sure why. I can easily respond with any/all screenshots and commands to help troubleshoot.

 

Here's my setup. This is a virtual setup using the virtualization on TrueNAS SCALE (it's basically QEMU/KVM). I've gotten the DS920+ boot loader to work just fine (seemingly because it uses a device tree and not SataPortMap). I had first tried the DS918+ since that was the most recommended, but it behaved the same way as this DVA3221: it boots fine and I see it in find.synology.com, but it says there are no available disks to install on.

 

Here's my configuration from TCRP (it's very simple): one SATA controller with two disks on it - the first disk is what TCRP boots off (the .vmdk image) and the second is a 200 GB virtual disk. When I run the satamap command in rploader, it correctly reports 2 disks, but this is what it reports:

 

tc@box:~$ ./rploader.sh satamap
Machine is VIRTUAL Hypervisor=KVM

Found "00:06.0 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)"
Detected 6 ports/2 drives. Mapping SATABOOT drive after maxdisks
WARNING: Other drives are connected that will not be accessible!

Computed settings:
SataPortMap=1
DiskIdxMap=10

Should i update the user_config.json with these values ? [Yy/Nn] y
Done.

 

Here's my lspci:

tc@box:~$ lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 05)
00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
00:04.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
00:05.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 03)
00:06.0 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:07.0 Communication controller: Red Hat, Inc Virtio console
00:08.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon

 

I'm doing the standard auto hardware discovery build with:

./rploader.sh build dva3221-7.1.1-42962

 

Obviously the NIC works fine as I'm able to find it and connect to it with find.synology.com

 

Again, the weird thing is that the DS920+ worked straight away - I didn't run the satamap command since it's not used (device tree), but the drive was detected just fine there.

FYI - yes, my virtual disks are set to AHCI (well, it's only one other disk, but the boot disk that the TCRP .vmdk is on is also AHCI).

 

Any ideas?

 

Also, if anyone can explain, or has a post on, exactly how these SataPortMap and DiskIdxMap variables map to what's on your system, that would be helpful. I've seen things saying they should be 0-9, but then I've seen 15, which I guess is two controllers, one with 1 port and one with 5, though I'm not 100% sure - and then what does DiskIdxMap mean? I'm pretty savvy and maybe I missed something, but again, I'm super close here and can't seem to get over this hump. Thanks in advance!

Edited by god-like

OK - after a bit of trial and error, I've gotten somewhere.

 

Since there are 6 ports, I figured first off I should set SataPortMap=6

 

I then set DiskIdxMap=01 (meaning to start with sdb - which is the 2nd drive - 00 being sda which is my vmdk)

 

After doing this I was able to install, but oddly, it's showing up as disk 3 in the unit (which doesn't make sense to me, as I told it to start the index at the second disk) - unless I'm misunderstanding how this works. Well, regardless, this is something:

 

[screenshot attached]
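For anyone following along, those two values end up in the extra_cmdline section of TCRP's user_config.json, roughly like this (a trimmed sketch; serial and MAC are placeholders, other sections omitted):

tc@box:~$ cat user_config.json
  "extra_cmdline": {
    "sn": "XXXXXXXXXX",
    "mac1": "XXXXXXXXXXXX",
    "SataPortMap": "6",
    "DiskIdxMap": "01"
  }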



Spun up my first DS3622xs+, then a DVA3221 (w/o surveillance), on a Proxmox 7 server with EPYC processors. Pretty cool, pretty easy.

I saw that the DVA3221 "should" not run on EPYC processors because of the MOVBE instruction; it's running, but I'm not pushing my luck.

 

That said, Proxmox 8 is out with QEMU 8, which has a new x86-64-v2-AES processor type. Will that resolve the issue with the MOVBE instruction when the Proxmox server hardware is EPYC?
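One thing you can check regardless of which CPU model you pick: whether movbe is actually exposed to the guest, since that is what the DVA's analytics binaries need. A quick sketch (qm runs on the Proxmox host, <vmid> is your VM's ID; "host" simply passes the physical CPU's flags through, assuming the EPYC reports movbe at all):

# inside the guest
grep -c movbe /proc/cpuinfo      # 0 means the flag is not exposed

# on the Proxmox host, the blunt alternative to a named CPU model
qm set <vmid> --cpu host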

 

I'm new to this, so this question might be dumb, but a new redpill is needed for Proxmox 8, right?

Thanks

 


On 6/10/2023 at 5:22 AM, god-like said:

Since there are 6 ports, I figured first off I should set SataPortMap=6. I then set DiskIdxMap=01 (meaning to start with sdb). After doing this I was able to install, but oddly, it's showing up as disk 3 in the unit.

 

If you have more than one SATA controller, the minimum port setting per controller is 1 (setting it to zero will cause a kernel panic). SataPortMap is a single character per controller. For example, let's say you have four controllers but all disks are on the fourth controller: you have to specify one port on the first, one on the second, one on the third and eight on the fourth controller, so SataPortMap=1118.

 

Now as for DiskIdxMap, it is a two-digit hex value per controller. For the above example you can set DiskIdxMap=0A0B0C00: the first disk on the first controller will be pushed down to index 0A (hex), the second controller's disk will be pushed to 0B, the disk on the third controller will be pushed to 0C, and the disks on the fourth controller will start from position 00 (slot 1 in the schematic), the second will occupy slot 2, etc.
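Written out as the boot arguments for that four-controller example (a sketch; indices are hex, and DSM's slot number is the index plus one):

SataPortMap=1118
DiskIdxMap=0A0B0C00
# controller 1: 1 port, its disk lands at index 0x0A (slot 11)
# controller 2: 1 port, index 0x0B (slot 12)
# controller 3: 1 port, index 0x0C (slot 13)
# controller 4: 8 ports, indices 0x00-0x07 (slots 1-8)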

 

 

 

Edited by pocopico

2 hours ago, Pedulla said:

but a new redpill is needed for Proxmox 8, right?

Why? It is totally unrelated.

 

But I would not try a very new Proxmox release in a PROD environment...

 

I'll personally wait some months, to see if an 8.1 comes out and fixes most of the new bugs in 8.0...


So just for grins I went and installed Proxmox 8 on an Intel-based dual-CPU system (Dell Precision 7910) with an NVIDIA Quadro K4000 I picked up used.

Then spun up a DVA3221 and passed through the K4000 to the VM.
[screenshot attached]

 

It appears to be working with no reported errors. 

Just FYI...


12 hours ago, Pedulla said:

It appears to be working with no reported errors. 

Correction: vehicle and face recognition are both causing seg-faults with the K4000 passed through. I thought it was a little too good to be true.

 

So I grabbed an ASUS Phoenix GTX 1650, threw it in, and took out the K4000; now I'm seeing the same symptoms as @polkue. The Info tab in the DVA3221 shows no GPU installed, even though lspci -k shows:
 

00:10.0 Class 0300: Device 10de:1f82 (rev a1)
	Subsystem: Device 1043:8773
	Kernel driver in use: nvidia

 

So why wouldn't the DVA3221 see the GPU?

 

dmesg on the DVA3221 reports:
 

[ 3418.850221] NVRM: GPU 0000:00:10.0: rm_init_adapter failed, device minor number 0
[ 3424.089267] NVRM: GPU 0000:00:10.0: RmInitAdapter failed! (0x23:0x56:515)

 

Any clues?


1 hour ago, Pedulla said:

Correction: vehicle and face recognition are both causing seg-faults with the K4000 passed through. [...] The Info tab in the DVA3221 shows no GPU installed, even though lspci -k shows "Kernel driver in use: nvidia" at 00:10.0, and dmesg reports "RmInitAdapter failed!". Any clues?

It probably should not be at 00:10.0...

 

[screenshot attached]

 

# dmesg |egrep -i "nvidia|gpu|vga"
[    0.502250] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=none,locks=none
[    0.504011] vgaarb: loaded
[    0.504271] vgaarb: bridge control possible 0000:01:00.0
[   32.899751] systemd[1]: Created slice NVIDIARuntimeLibrary's slice.
[   32.900844] systemd[1]: Starting NVIDIARuntimeLibrary's slice.
[   33.635292] nvidia: module license 'NVIDIA' taints kernel.
[   33.652604] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[   33.654804] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
[   33.698618] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  440.44  Sun Dec  8 03:38:56 UTC 2019
[   33.738736] nvidia-uvm: Loaded the UVM driver, major device number 245.
[   35.020416] NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead.

 

# lspci -nnkvq |grep -i vga
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])

# lspci -nnkvq -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Device [103c:8558]
        Flags: bus master, fast devsel, latency 0, IRQ 33
        Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 7000000000 (64-bit, prefetchable) [size=256M]
        Memory at 7010000000 (64-bit, prefetchable) [size=32M]
        I/O ports at d000 [size=128]
        Expansion ROM at c1080000 [virtual] [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nvidia

 


You should review: https://pve.proxmox.com/wiki/PCI_Passthrough#GPU_passthrough
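In other words, the card should sit on its own PCIe bus in the guest (01:00.0 here), not at 00:10.0 on the emulated PCI bus. A hedged sketch of the relevant lines in /etc/pve/qemu-server/<vmid>.conf (addresses are examples, x-vga is optional):

bios: ovmf
machine: q35
hostpci0: 0000:01:00,pcie=1,x-vga=1
# q35 + pcie=1 puts the GPU behind a PCIe root port, so the guest sees it
# at a 01:00.0-style address instead of 00:10.0 on the i440fx PCI bus.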

Edited by Orphée

sudo lspci -nnkvq -s 01:00.0
Password: 
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device [1043:8773]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at e0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at 5000 [size=128]
	Expansion ROM at fc000000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Kernel driver in use: nvidia

Yet...

 

Should be working, no?

 

Screenshot at 2023-06-30 16-40-41.png


My GTX 1650 is not working with the DVA analytics.

 

/var/log/messages shows:

2023-07-03T22:33:42+02:00 DVA3221 kernel: [ 1828.861244] NVRM: Xid (PCI:0000:00:10): 31, pid=30352, Ch 00000020, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7ff2_80159000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

lspci
00:00.0 Class 0600: Device 8086:1237 (rev 02)
00:01.0 Class 0601: Device 8086:7000
00:01.1 Class 0101: Device 8086:7010
00:01.2 Class 0c03: Device 8086:7020 (rev 01)
00:01.3 Class 0680: Device 8086:7113 (rev 03)
00:03.0 Class 00ff: Device 1af4:1002
00:05.0 Class 0604: Device 1b36:0001
00:07.0 Class 0106: Device 8086:2922 (rev 02)
00:10.0 Class 0300: Device 10de:1f82 (rev a1)
00:10.1 Class 0403: Device 10de:10fa (rev a1)
00:12.0 Class 0200: Device 1af4:1000
00:1e.0 Class 0604: Device 1b36:0001
00:1f.0 Class 0604: Device 1b36:0001


lspci -nnkvq -s 00:10.0
00:10.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device [1043:86b7]
        Flags: bus master, fast devsel, latency 0, IRQ 28
        Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
        [virtual] Memory at 800000000 (64-bit, prefetchable) [size=256M]
        Memory at 810000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 1000 [size=128]
        [virtual] Expansion ROM at c1660000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Kernel driver in use: nvidia
 


On 7/4/2023 at 2:47 AM, Pedulla said:

Do you think it's a manufacturer-specific issue?

 

Yours is an HP video card, ours are ASUS...

Can you try changing the GPU device ID to:

https://www.techpowerup.com/vgabios/246602/246602

 

And 

 

https://www.techpowerup.com/vgabios/209739/msi-gtx1660-6144-190213

 

These two work fine for sure, because I have them and tested them on bare metal. If it doesn't work, I guess it has something to do with Proxmox or an incorrect passthrough. If it does work with the following device IDs, then it's definitely a vendor problem.
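If you want to test one of those ROMs without flashing the card, Proxmox can feed a vBIOS file to the VM via the romfile= sub-option (a sketch; the filename is whatever you saved the TechPowerUp dump as, and whether DSM then behaves differently I can't promise):

# on the Proxmox host
cp GTX1650-asus.rom /usr/share/kvm/
qm set <vmid> --hostpci0 0000:01:00,pcie=1,romfile=GTX1650-asus.rom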

Edited by dimakv2014
Typo

23 minutes ago, Orphée said:

Just saying I HAVE an HP card... with the same PCI IDs as above...

By the way, I found another loader which is not on our forum:

https://github.com/AuxXxilium/arc

 

As far as I can see it is a clone of ARPL with some modifications; some have reported a working HP Gen8 with DVA3219/DVA3221. I have tested it for a few days and managed to run both DVA3221 and DVA3219 on a Celeron N2807, which isn't supposed to work, but it worked after doing 918+ first and then migrating to DVA. So far stable.

Screenshot_2023-07-07-02-22-50-656_com.miui.gallery.jpg


I'm pretty sure the manufacturer has no impact as long as you have a GTX 1650 detected as "10de:1f82".

Recheck your passthrough settings. 

 

On the Proxmox host:

 

iommu.sh

#!/bin/bash
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;

 

# ./iommu.sh 
IOMMU Group 0:
        00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] [8086:3e30] (rev 0a)
IOMMU Group 1:
        00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 0a)
IOMMU Group 2:
        00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 0a)
IOMMU Group 3:
        00:02.0 Display controller [0380]: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630] [8086:3e98]
IOMMU Group 4:
        00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 5:
        00:12.0 Signal processing controller [1180]: Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379] (rev 10)
IOMMU Group 6:
        00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10)
        00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f] (rev 10)
IOMMU Group 7:
        00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 [8086:a368] (rev 10)
        00:15.1 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 [8086:a369] (rev 10)
IOMMU Group 8:
        00:16.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller [8086:a360] (rev 10)
IOMMU Group 9:
        00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10)
IOMMU Group 10:
        00:1b.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #17 [8086:a340] (rev f0)
IOMMU Group 11:
        00:1b.4 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 [8086:a32c] (rev f0)
IOMMU Group 12:
        00:1c.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 [8086:a338] (rev f0)
IOMMU Group 13:
        00:1c.2 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #3 [8086:a33a] (rev f0)
IOMMU Group 14:
        00:1c.5 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #6 [8086:a33d] (rev f0)
IOMMU Group 15:
        00:1c.6 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #7 [8086:a33e] (rev f0)
IOMMU Group 16:
        00:1c.7 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #8 [8086:a33f] (rev f0)
IOMMU Group 17:
        00:1e.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH Serial IO UART Host Controller [8086:a328] (rev 10)
IOMMU Group 18:
        00:1f.0 ISA bridge [0601]: Intel Corporation Cannon Point-LP LPC Controller [8086:a309] (rev 10)
        00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10)
        00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10)
        00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller [8086:a324] (rev 10)
        00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (7) I219-LM [8086:15bb] (rev 10)
IOMMU Group 19:
        01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1)
IOMMU Group 20:
        01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
IOMMU Group 21:
        02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 980] [10de:13c0] (rev a1)
IOMMU Group 22:
        02:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
IOMMU Group 23:
        04:00.0 Audio device [0403]: Creative Labs Sound Core3D [Sound Blaster Recon3D / Z-Series] [1102:0012] (rev 01)
IOMMU Group 24:
        06:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)
IOMMU Group 25:
        07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 26:
        08:00.0 PCI bridge [0604]: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge [1a03:1150] (rev 04)
        09:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 41)
IOMMU Group 27:
        0a:00.0 PCI bridge [0604]: Tundra Semiconductor Corp. Device [10e3:8113] (rev 01)

 

Check that your GTX 1650 is alone in its group, like mine in group 19.
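A quicker check if you only care about the one device (plain sysfs, adjust the PCI address and group number):

readlink /sys/bus/pci/devices/0000:01:00.0/iommu_group   # prints ../../../kernel/iommu_groups/19
ls /sys/kernel/iommu_groups/19/devices/                  # everything sharing that group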

 

# dmesg |grep -i -E "dmar|iommu|nvidia"
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init
[    0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[    0.008369] ACPI: DMAR 0x000000003B2401D8 0000C8 (v01 INTEL  EDK2     00000002      01000013)
[    0.008405] ACPI: Reserving DMAR table memory at [mem 0x3b2401d8-0x3b24029f]
[    0.078447] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init
[    0.078496] DMAR: IOMMU enabled
[    0.221225] DMAR: Host address width 39
[    0.221225] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.221230] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
[    0.221232] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.221234] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.221236] DMAR: RMRR base: 0x0000003b699000 end: 0x0000003b8e2fff
[    0.221237] DMAR: RMRR base: 0x0000003d000000 end: 0x0000003f7fffff
[    0.221238] DMAR: RMRR base: 0x0000003acde000 end: 0x0000003ad5dfff
[    0.221240] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.221241] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.221242] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.224362] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.529184] iommu: Default domain type: Passthrough (set via kernel command line)
[    0.636750] DMAR: No ATSR found
[    0.636751] DMAR: No SATC found
[    0.636752] DMAR: IOMMU feature fl1gp_support inconsistent
[    0.636753] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.636754] DMAR: IOMMU feature nwfs inconsistent
[    0.636754] DMAR: IOMMU feature pasid inconsistent
[    0.636755] DMAR: IOMMU feature eafs inconsistent
[    0.636756] DMAR: IOMMU feature prs inconsistent
[    0.636756] DMAR: IOMMU feature nest inconsistent
[    0.636757] DMAR: IOMMU feature mts inconsistent
[    0.636757] DMAR: IOMMU feature sc_support inconsistent
[    0.636758] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.636759] DMAR: dmar0: Using Queued invalidation
[    0.636761] DMAR: dmar1: Using Queued invalidation
[    0.637004] pci 0000:00:00.0: Adding to iommu group 0
[    0.637015] pci 0000:00:01.0: Adding to iommu group 1
[    0.637024] pci 0000:00:01.1: Adding to iommu group 2
[    0.637033] pci 0000:00:02.0: Adding to iommu group 3
[    0.637040] pci 0000:00:08.0: Adding to iommu group 4
[    0.637052] pci 0000:00:12.0: Adding to iommu group 5
[    0.637067] pci 0000:00:14.0: Adding to iommu group 6
[    0.637074] pci 0000:00:14.2: Adding to iommu group 6
[    0.637088] pci 0000:00:15.0: Adding to iommu group 7
[    0.637095] pci 0000:00:15.1: Adding to iommu group 7
[    0.637107] pci 0000:00:16.0: Adding to iommu group 8
[    0.637114] pci 0000:00:17.0: Adding to iommu group 9
[    0.637141] pci 0000:00:1b.0: Adding to iommu group 10
[    0.637161] pci 0000:00:1b.4: Adding to iommu group 11
[    0.637191] pci 0000:00:1c.0: Adding to iommu group 12
[    0.637212] pci 0000:00:1c.2: Adding to iommu group 13
[    0.637227] pci 0000:00:1c.5: Adding to iommu group 14
[    0.637249] pci 0000:00:1c.6: Adding to iommu group 15
[    0.637269] pci 0000:00:1c.7: Adding to iommu group 16
[    0.637280] pci 0000:00:1e.0: Adding to iommu group 17
[    0.637304] pci 0000:00:1f.0: Adding to iommu group 18
[    0.637312] pci 0000:00:1f.3: Adding to iommu group 18
[    0.637320] pci 0000:00:1f.4: Adding to iommu group 18
[    0.637329] pci 0000:00:1f.5: Adding to iommu group 18
[    0.637338] pci 0000:00:1f.6: Adding to iommu group 18
[    0.637351] pci 0000:01:00.0: Adding to iommu group 19
[    0.637360] pci 0000:01:00.1: Adding to iommu group 20
[    0.637371] pci 0000:02:00.0: Adding to iommu group 21
[    0.637381] pci 0000:02:00.1: Adding to iommu group 22
[    0.637404] pci 0000:04:00.0: Adding to iommu group 23
[    0.637421] pci 0000:06:00.0: Adding to iommu group 24
[    0.637443] pci 0000:07:00.0: Adding to iommu group 25
[    0.637466] pci 0000:08:00.0: Adding to iommu group 26
[    0.637469] pci 0000:09:00.0: Adding to iommu group 26
[    0.637489] pci 0000:0a:00.0: Adding to iommu group 27
[    0.637603] DMAR: Intel(R) Virtualization Technology for Directed I/O

 

 

Try going bare metal to confirm.

 

 

Edit: On the Proxmox host, to confirm nvidia is blacklisted:

Quote

 

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117 [GeForce GTX 1650] [10de:1f82] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company TU117 [GeForce GTX 1650] [103c:8558]
        Flags: bus master, fast devsel, latency 0, IRQ 139, IOMMU group 19
        Memory at 82000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 70000000 (64-bit, prefetchable) [size=256M]
        Memory at 80000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 6000 [size=128]
        Expansion ROM at 83000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau

01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
        Subsystem: Hewlett-Packard Company Device [103c:8558]
        Flags: fast devsel, IRQ 17, IOMMU group 20
        Memory at 83080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
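Two quick host-side commands that tell you the same thing without reading the whole dump (standard tools, adjust the address):

lspci -nnk -s 01:00.0 | grep -A1 "Kernel driver"   # expect: Kernel driver in use: vfio-pci
lsmod | grep -E "^nvidia|^nouveau"                 # should print nothing if the blacklist took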

 

 

 

On the Synology NAS (as root):

NAS:~# dmesg |grep -i -E "nvidia|nvrm|vga"
[    0.344040] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=none,locks=none
[    0.344593] vgaarb: loaded
[    0.344787] vgaarb: bridge control possible 0000:01:00.0
[   31.569866] systemd[1]: Created slice NVIDIARuntimeLibrary's slice.
[   31.570382] systemd[1]: Starting NVIDIARuntimeLibrary's slice.
[   32.567414] nvidia: module license 'NVIDIA' taints kernel.
[   32.579770] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[   32.581126] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
[   32.630064] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  440.44  Sun Dec  8 03:38:56 UTC 2019
[   32.646607] nvidia-uvm: Loaded the UVM driver, major device number 245.
[   33.567357] NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead

NAS:~# nvidia-smi 
Fri Jul  7 10:17:13 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1650    On   | 00000000:01:00.0 Off |                  N/A |
| 43%   63C    P0    33W /  75W |   1913MiB /  3911MiB |     36%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     22099      C   ...anceStation/target/synodva/bin/synodvad   970MiB |
|    0     22181      C   ...ceStation/target/synoface/bin/synofaced   932MiB |
+-----------------------------------------------------------------------------+

 

Edited by Orphée
