├── README
├── .gitignore
├── 5.4.0.104
│   └── README
├── 4.15.0.45
│   ├── README
│   └── link_mgmt-4.15.0-45.48.patch
└── 4.18.0.22
    ├── README
    ├── ntb_link_mgmt_Ubuntu-hwe-4.18.0-22.23.patch
    └── nvscic2c_Ubuntu-hwe-4.18.0-22.23.patch

/README:
--------------------------------------------------------------------------------
README for bringup of NvSciC2C on x86.
- With DRIVE OS 5.1.6 release use 4.15.0.45/
- With DRIVE OS 5.1.9 release use 4.18.0.22/
- With DRIVE OS 6.0.3 release use 5.4.0.104/

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
#
# NOTE! Don't add files that are generated in specific
# subdirectories here. Add them in the ".gitignore" file
# in that subdirectory instead.
#
# NOTE! Please use 'git ls-files -i --exclude-standard'
# command after changing this file, to see if there are
# any tracked files which get ignored after the change.
#
# Normal rules (sorted alphabetically)
#
.*
*.a
*.bin
*.bz2
*.c.[012]*.*
*.dtb
*.dtb.S
*.dwo
*.elf
*.gcno
*.gz
*.i
*.ko
*.ll
*.lst
*.lz4
*.lzma
*.lzo
*.mod.c
*.o
*.o.*
*.order
#*.patch
*.s
*.so
*.so.dbg
*.su
*.symtypes
*.tar
*.xz
Module.symvers
modules.builtin

#
# Top-level generic files
#
/tags
/TAGS
/linux
/vmlinux
/vmlinux.32
/vmlinux-gdb.py
/vmlinuz
/System.map
/Module.markers

#
# RPM spec file (make rpm-pkg)
#
/*.spec

#
# Debian directory (make deb-pkg)
#
#/debian/

#
# Snap directory (make snap-pkg)
#
/snap/

#
# tar directory (make tar*-pkg)
#
/tar-install/

#
# git files that we don't want to ignore even if they are dot-files
#
!.gitignore
!.mailmap
!.cocciconfig

#
# Generated include files
#
include/config
include/generated
arch/*/include/generated

# stgit generated dirs
patches-*

# quilt's files
patches
series

# cscope files
cscope.*
ncscope.*

# gnu global files
GPATH
GRTAGS
GSYMS
GTAGS

# id-utils files
ID

*.orig
*~
\#*#

#
# Leavings from module signing
#
extra_certificates
signing_key.pem
signing_key.priv
signing_key.x509
x509.genkey

# Kconfig presets
all.config

# Kdevelop4
*.kdev4

--------------------------------------------------------------------------------
/5.4.0.104/README:
--------------------------------------------------------------------------------
README for bringup of NvSciC2c driver on x86. It covers:
    1. Platform.
    2. Build of NvSciC2c driver and its dependencies.
    3. NvSciC2c driver execution on x86.
    4. Contact.


1. Platform
    NvSciC2c is an NVIDIA proprietary Chip-to-Chip SW communication protocol which
    allows exchange of data over PCIe Root-Port <-> Endpoint in the following manner:
        a. Low latency small packets via CPU transfers.
        b. Bulk data packets via DMA transfers.

    Supported HW platforms:
        a. 3rd generation scalable Intel Xeon with NVIDIA DRIVE-A100 Automotive GPU (optional)

    Supported PCIe reference topology:
        a. PCIe re-timer card P3722 is inserted into a X16 PCIe slot on the Intel Xeon board
        b. Interconnection Board P3713 is present in the DRIVE AGX Devkit
        c. miniSAS Port-B of P3713 is connected to miniSAS Port D of P3722 with a PCIe miniSAS cable

    Supported SW Release and Configuration(s):
        a. Ubuntu 20.04 LTS installation with kernel=5.4.0-104-generic
        b. NVIDIA GPU Driver with version >= 510.73
        c. The IOMMU should be enabled by having intel_iommu=on in the kernel command line

2. Build NvSciC2c driver and its dependencies.

2.1 Build and Install kernel 5.4.0-104-generic with IOMMU_DMA enabled:
    The NvSciC2c driver uses IOMMU APIs to allocate the IOVA that is used for programming the DMA.
    The IOMMU_DMA config, which provides the IOMMU-agnostic DMA-mapping layer, is currently disabled by default in
    K5.4 and needs to be enabled.

    Steps to build Kernel 5.4: BuildingYourOwnKernel(https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel)
        a. Obtain the source for an Ubuntu release:
            git clone git://kernel.ubuntu.com/ubuntu/ubuntu-focal.git
        b. Check out the Ubuntu 5.4.0-104 tag
            cd ubuntu-focal
            git checkout tags/Ubuntu-5.4.0-104.118 -b your_branch
        c. Enable IOMMU_DMA by directly editing drivers/iommu/Kconfig and marking it "default y"
           for the IOMMU_DMA section
        d. Build the kernel
            LANG=C fakeroot debian/rules clean
            LANG=C fakeroot debian/rules binary-headers binary-generic
        e. Once built, install the kernel deb files
            sudo dpkg -i ../linux*5.4.0*.deb
        f. Set the newly built image as default in grub
            sudo vi /etc/default/grub
            change GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-104-generic"
            set GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
            sudo update-grub
            sudo reboot
           On reboot, uname -r should reflect the kernel version as:
            5.4.0-104-generic

2.2 Steps to build NvSciC2c driver on top of K5.4 cloned in step 2.1:
        a. Switch to the same branch created in step 2.1 or create a new branch
            cd ubuntu-focal/drivers/misc/
            git checkout your_branch
        b. Apply the patches for NvSciC2c driver
            git apply nvscic2c_Ubuntu_5.4.0-104-generic
        c. Verify Module.symvers is present in /usr/src/nvidia-510.73, otherwise generate it manually
            cd /usr/src/nvidia-510.73
            sudo make
        d. Build NvSciC2c driver as an out-of-tree module
            cd drivers/misc/nvscic2c-pcie
            sudo make
            sudo make install
        e. To use the NvSciC2c driver without dGPU, build it using
            sudo make -j DISABLE_GPU=1
            sudo make install

3. NvSciC2c driver execution:

3.1 Detect AGX Orin as PCIe Endpoint

    For the NvSciC2c driver to be used on x86, the DRIVE AGX Orin should be detected as a valid PCIe endpoint.
    The details for Orin setup are captured in the DRIVE OS release documentation.

3.2 Inserting NvSciC2c driver
    Once the NvSciC2c driver is compiled by following the steps 2.2[a-d], it can be inserted into the kernel.
        sudo modprobe nvscic2c-pcie-epc

4. Contact:
    Evan Shi
    Chirantan
    Arihant Jejani
    Deepak Kumar Badgaiyan
    Bob Johnston

--------------------------------------------------------------------------------
/4.15.0.45/README:
--------------------------------------------------------------------------------
README for bringup of NvSciC2C on x86. It covers:
    1. Platform.
    2. Reserving contiguous memory(=64MB) for PCIe shared memory backing.
    3. NTB changes and auto-load of its LKM(s).
    4. Setting up Kernel sources (on linux host machine) for NvSciC2C LKM build.
    5. Steps to insert NvSciC2C LKM on x86.
    6. Contact.


1. Platform

    NvSciC2C is an NVIDIA proprietary Chip-to-Chip SW communication protocol which
    allows PCIe Root-Ports to exchange
        a. Low latency small packets via CPU transfers.
        b. Bulk data packets via DMA transfers.
    via NTB PCIe EndPoint over PCIe interface.

    Supported platform is E3550 B01/B03 DDPX with Intel Xeon COMEX board hosted.
    Supported PCIe topology is Xavier-A(Tegra) connected via PCIe-NT
    (Non-Transparent Bridging)vEP to Intel Xeon(x86) COMEX.

    Both Xavier-A(QNX) and Intel Xeon are expected to have IOMMU=OFF for their
    respective PCIe devices/EPs.

    Supported SW Release(s):
        a. Xavier-A: 5.1.3.0 DRIVE OS QNX with pct=d-av-q.
        b. Intel Xeon: 18.04 LTS Ubuntu installation, kernel=4.15.0-45.48
           (a.k.a 4.15.0-45-generic)

    Ensure Intel Xeon has Ethernet connectivity to move NvSciC2C (and dependent)
    LKMs from host m/c to Intel Xeon.

    Refer to the Internal Confluence Page on E3550+ADLINK setup or the External
    release notes for DDPX board bringup with Intel Xeon X86 COMEX.


2. Reserving 'reserved memory' contiguous memory region

    For Intel Xeon's NvSciC2C SW to receive bulk data from Xavier-A over PCIe
    with Intel IOMMU=OFF, a large contiguous memory region needs to be exposed via
    the NT-EP direct memory window. This allows Xavier-A to pass its produced bulk
    frames (from camera, GFX, etc.) to Intel Xeon x86. To reserve such a memory
    region, use the kernel boot-arg 'memmap'.

    The physical address of this reserved memory region must be aligned to the
    size of the region. For example: if 64MB of reserved memory is required, it
    can be marked reserved at 0x88000000 (if not marked reserved already, else at
    some other address that satisfies the same alignment).

    NvSciC2C requires exactly a 64MB block. This fixed address and size are passed
    as module parameters to nvscic2c.ko (mentioned later.)

    Steps:
    a. Check the physical memory available for reservation
        comex@ddpx-xeon:~$ dmesg | grep BIOS-e820

    b. Reserve 64MB memory at 0x88000000 (if seen usable in (2)(a))
        comex@ddpx-xeon:~$ sudo vi /etc/default/grub
        - GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0"
        + ##GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0"
        + GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0 memmap=64M\\\$0x88000000"

    c. Update the memmap option in current grub configuration
        comex@ddpx-xeon:~$ sudo update-grub2

    d. See the memmap option reflected in grub/boot
        comex@ddpx-xeon:~$ vi /boot/grub/grub.cfg
        [linux /boot/vmlinuz-4.15.0-39-generic root=/dev/sda2 ro console=ttyS4,115200 console=tty0 memmap=64M\$0x88000000 3]
        comex@ddpx-xeon:~$ sudo reboot

    e. Once rebooted, check the kernel command line; the physical memory should
       now be marked persistent/reserved.


3. NTB changes and auto-load of its LKM(s).

    a. For NvSciC2C, also an NTB client, the memory window size requirement (64MB)
       exceeds the default supported by the NTB module (2MB). Currently, this
       window size is set to 64MB as explained in (2). We pass this size as an LKM
       parameter to the NTB switchtec module while loading it manually. We
       recommend that users disable auto-load of the NTB LKM(s): switchtec.ko,
       ntb.ko and ntb_hw_switchtec.ko. How to manually load them is mentioned later.

    b. The upstream NTB module uses 1 MSI-X for all 28 supported DBs and 1 MSI-X
       for the 4 MSG registers. With that scheme, deducing the DB index triggered
       by the remote requires checking each bit of the DB register, each time
       resetting the set DB index to zero. We have introduced a change in the
       upstream NTB LKM under Kconfig NTB_LINK_MGMT, where we request 32 MSI-X
       vectors overall from Intel Xeon's PCIe sub-system and assign each of the
       28 DBs and 4 MSG registers a distinct MSI-X vector.


4. Setting up kernel sources for NvSciC2C LKM build.

    NvSciC2C SW has two components: Loadable Kernel Module(LKM) and User-Space
    Library(USL).

    To build the NvSciC2C LKM on top of the Ubuntu bionic kernel clone, we must
    apply the NvSciC2C patch and the NTB Link MGMT patch it depends on, and then
    follow the standard BuildingYourOwnKernel process ratified by Canonical.

    The following steps can be done either on Intel Xeon or on Ubuntu host m/c.

    Steps:
    a. #### Clone the git repo.
        comex@ddpx-xeon:~$ git clone git://kernel.ubuntu.com/ubuntu/ubuntu-bionic.git
        comex@ddpx-xeon:~$ cd ubuntu-bionic/
        comex@ddpx-xeon:~$ git checkout -b temp Ubuntu-4.15.0-45.48

    b. ## Compile the objects into a separate directory.
        comex@ddpx-xeon:~$ export KBUILD_OUTPUT=`pwd`/out

    c. ### Fix a peculiar build error while making the kernel
        comex@ddpx-xeon:~$ cp debian/scripts/retpoline-extract-one scripts/ubuntu-retpoline-extract-one

    d. ### Apply the patches for nvscic2c LKM and its dependency
        comex@ddpx-xeon:~$ git apply link_mgmt-4.15.0-45.48.patch
        comex@ddpx-xeon:~$ git apply nvscic2c-4.15.0-45.48.patch

    e. ### Enable the NvSciC2C modules as 'm' via menuconfig. This will also set the Kconfig: NTB_LINK_MGMT to 'y'
        comex@ddpx-xeon:~$ make SUBLEVEL=0 EXTRAVERSION=-45-generic menuconfig

    f. ### Compile the kernel source. Set LOCALVERSION="" to avoid magic version mismatch issue while inserting NvSciC2C and NTB LKM(s)
        comex@ddpx-xeon:~$ make SUBLEVEL=0 EXTRAVERSION=-45-generic -j`getconf _NPROCESSORS_ONLN` LOCALVERSION=""


5. Inserting NvSciC2C LKM manually.

    Once the NvSciC2C LKM and NTB LKM(s) with NTB_LINK_MGMT='y' are compiled
    following steps 4[a-f], we can insert these modules manually. If built
    on host m/c, these must be copied to the Intel Xeon file-system. These must
    be inserted in the order listed here.

    The NvSciC2C LKM params must match the address and size of the memory
    region reserved in step (2).

    One can optionally add dyndbg=+p to each of these to have a verbose output.

        comex@ddpx-xeon:~$ sudo insmod ntb.ko
        comex@ddpx-xeon:~$ sudo insmod switchtec.ko
        comex@ddpx-xeon:~$ sudo insmod ntb_hw_switchtec.ko max_mw_size=0x04000000
        comex@ddpx-xeon:~$ sudo insmod nvscic2c.ko fixed_mw_addr=0x88000000 fixed_mw_size=0x04000000

    nvscic2c.ko can take as long as 8 sec to load. This is because we are
    issuing memset() of the reserved memory region in NO_CACHE mode.

6. Contact:
    Arihant Jejani
    Bob Johnston
    Deepak Kumar Badgaiyan
    Tushar Padlikar

--------------------------------------------------------------------------------
/4.18.0.22/README:
--------------------------------------------------------------------------------
README for bringup of NvSciC2C on x86. It covers:
    1. Platform.
    2. Reserving contiguous memory(=256MB) for PCIe shared memory backing.
    3. NTB changes and auto-load of its LKM(s).
    4. Setting up Kernel sources (on linux host machine) for NvSciC2C LKM build.
    5. Steps to insert NvSciC2C LKM on x86.
    6. Contact.
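
Note on item 2: section 2 of this README requires the physical base address of
the reserved region to be aligned to the region size. The following is a minimal
sketch, not part of the NvSciC2C sources, of how a candidate base/size pair could
be sanity-checked before editing the grub configuration. The function names
is_size_aligned and memmap_arg are hypothetical helpers introduced here for
illustration only; the addresses and sizes are the ones used in this README.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions for illustration, not shipped with NvSciC2C).

# is_size_aligned BASE SIZE
# Succeeds (exit status 0) when SIZE is non-zero and BASE is a multiple of SIZE,
# i.e. the base address is aligned to the size of the region.
is_size_aligned() {
    base=$(( $1 ))
    size=$(( $2 ))
    [ "$size" -gt 0 ] || return 1
    [ $(( base % size )) -eq 0 ]
}

# memmap_arg BASE SIZE_MB
# Prints the boot argument in the memmap=<size>M$<base> form that the kernel
# parses. Shell/grub escaping of the '$' is NOT applied here.
memmap_arg() {
    printf 'memmap=%sM$%s\n' "$2" "$1"
}

# 256MB (0x10000000) region at 0x90000000, as used in this README.
if is_size_aligned 0x90000000 0x10000000; then
    memmap_arg 0x90000000 256    # prints: memmap=256M$0x90000000
fi
```

The printed string is the argument as the kernel sees it; the additional
backslash escaping needed in /etc/default/grub is shown in step (2)(b).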


1. Platform

    NvSciC2C is an NVIDIA proprietary Chip-to-Chip SW communication protocol which
    allows PCIe Root-Ports to exchange
        a. Low latency small packets via CPU transfers.
        b. Bulk data packets via DMA transfers.
    via NTB PCIe EndPoint over PCIe interface.

    Supported platform is E3550 B01/B03 DDPX with Intel Xeon COMEX board hosted.
    Supported PCIe topology is Xavier-A(Tegra) connected via PCIe-NT
    (Non-Transparent Bridging)vEP to Intel Xeon(x86) COMEX.

    Both Xavier-A and Intel Xeon are expected to have IOMMU=OFF for their
    respective PCIe devices/EPs and are also considered to be I/O Coherent
    for PCIe devices.

    Supported SW Release(s):
        a. Xavier-A: 5.1.9.0 DRIVE OS with pct=C2C for DAV-L, DAV-Q.
        b. Intel Xeon: 18.04 LTS Ubuntu installation, kernel=Ubuntu-hwe-4.18.0-22.23~18.04.1
           (a.k.a 4.18.0-22-generic)

    Ensure Intel Xeon has Ethernet connectivity to move NvSciC2C (and dependent)
    LKMs from host m/c to Intel Xeon.

    Refer to the Internal Confluence Page on E3550+ADLINK setup or the External
    release notes for DDPX board bringup with Intel Xeon X86 COMEX.


2. Reserving 'reserved memory' contiguous memory region

    For Intel Xeon's NvSciC2C SW to receive bulk data from Xavier-A over PCIe
    with Intel IOMMU=OFF, a large contiguous memory region needs to be exposed via
    the NT-EP direct memory window. This allows Xavier-A to pass its produced bulk
    frames (from camera, GFX, etc.) to Intel Xeon x86. To reserve such a memory
    region, use the kernel boot-arg 'memmap'.

    The physical address of this reserved memory region must be aligned to the
    size of the region. For example: if 256MB of reserved memory is required, it
    can be marked reserved at 0x90000000 (if not marked reserved already, else at
    some other address that satisfies the same alignment).

    NvSciC2C requires exactly a 256MB block. This fixed address and size are passed
    as module parameters to nvscic2c.ko (mentioned later.)

    Steps:
    a. Check the physical memory available for reservation
        comex@ddpx-xeon:~$ dmesg | grep BIOS-e820

    b. Reserve 256MB memory at 0x90000000 (if seen usable in (2)(a))
        comex@ddpx-xeon:~$ sudo vi /etc/default/grub
        - GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0"
        + ##GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0"
        + GRUB_CMDLINE_LINUX="console=ttyS4,115200 console=tty0 memmap=256M\\\$0x90000000"

    c. Update the memmap option in current grub configuration
        comex@ddpx-xeon:~$ sudo update-grub2

    d. See the memmap option reflected in grub/boot
        comex@ddpx-xeon:~$ vi /boot/grub/grub.cfg
        [linux /boot/vmlinuz-4.18.0-22-generic root=/dev/sda2 ro console=ttyS4,115200 console=tty0 memmap=256M\$0x90000000 3]
        comex@ddpx-xeon:~$ sudo reboot

    e. Once rebooted, check the kernel command line; the physical memory should
       now be marked persistent/reserved.


3. NTB changes and auto-load of its LKM(s).

    a. For NvSciC2C, also an NTB client, the memory window size requirement (256MB)
       exceeds the default supported by the NTB module (2MB). Currently, this
       window size is set to 256MB as explained in (2). We pass this size as an LKM
       parameter to the NTB switchtec module while loading it manually. We
       recommend that users disable auto-load of the NTB LKM(s): switchtec.ko,
       ntb.ko and ntb_hw_switchtec.ko. How to manually load them is mentioned later.

        comex@ddpx-xeon:~$ sudo mv /lib/modules/4.18.0-22-generic/kernel/drivers/ntb/ntb.ko ~/ntb.ko_bkp
        comex@ddpx-xeon:~$ sudo mv /lib/modules/4.18.0-22-generic/kernel/drivers/ntb/hw/mscc/ntb_hw_switchtec.ko ~/ntb_hw_switchtec.ko_bkp
        comex@ddpx-xeon:~$ sudo mv /lib/modules/4.18.0-22-generic/kernel/drivers/pci/switch/switchtec.ko ~/switchtec.ko_bkp
        comex@ddpx-xeon:~$ sudo reboot

    b. The upstream NTB module uses 1 MSI-X for all 28 supported DBs and 1 MSI-X
       for the 4 MSG registers. With that scheme, deducing the DB index triggered
       by the remote requires checking each bit of the DB register, each time
       resetting the set DB index to zero. We have introduced a change in the
       upstream NTB LKM under Kconfig NTB_LINK_MGMT, where we request 32 MSI-X
       vectors overall from Intel Xeon's PCIe sub-system and assign each of the
       28 DBs and 4 MSG registers a distinct MSI-X vector.


4. Setting up kernel sources for NvSciC2C LKM build.

    The host m/c used to compile the NvSciC2C LKM should run Ubuntu 18.04.2 LTS with 4.18.0-22-generic #23~18.04.1-Ubuntu.

    For installing Ubuntu-hwe-4.18.0-22.23_18.04.1 on host m/c:
        comex@ddpx-xeon:~$ wget https://launchpad.net/~canonical-kernel-security-team/+archive/ubuntu/ppa/+build/16907417/+files/linux-image-unsigned-4.18.0-22-generic_4.18.0-22.23~18.04.1_amd64.deb
        comex@ddpx-xeon:~$ wget https://launchpad.net/~canonical-kernel-security-team/+archive/ubuntu/ppa/+build/16907417/+files/linux-headers-4.18.0-22_4.18.0-22.23~18.04.1_all.deb
        comex@ddpx-xeon:~$ wget https://launchpad.net/~canonical-kernel-security-team/+archive/ubuntu/ppa/+build/16907417/+files/linux-headers-4.18.0-22-generic_4.18.0-22.23~18.04.1_amd64.deb
        comex@ddpx-xeon:~$ wget https://launchpad.net/~canonical-kernel-security-team/+archive/ubuntu/ppa/+build/16907417/+files/linux-modules-4.18.0-22-generic_4.18.0-22.23~18.04.1_amd64.deb
        comex@ddpx-xeon:~$ wget https://launchpad.net/~canonical-kernel-security-team/+archive/ubuntu/ppa/+build/16907417/+files/linux-modules-extra-4.18.0-22-generic_4.18.0-22.23~18.04.1_amd64.deb

    Install the above packages:
        comex@ddpx-xeon:~$ sudo dpkg -i linux-*.deb
        comex@ddpx-xeon:~$ sudo update-initramfs -u

    Change grub to boot with 4.18.0-22.23 kernel:
        comex@ddpx-xeon:~$ sudo vi /etc/default/grub
        -- GRUB_DEFAULT=0
        ++ GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 4.18.0-22-generic"
        comex@ddpx-xeon:~$ sudo update-grub
        comex@ddpx-xeon:~$ sudo reboot

    On reboot, uname -a should reflect the kernel version as:
        comex@ddpx-xeon:~$ uname -a
        Linux ddpx-xeon 4.18.0-22-generic #23~18.04.1-Ubuntu SMP Thu Jun 6 08:37:25 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

    Make sure the following packages are installed on the host m/c for building LKM modules, and execute Step (3)(a):
        comex@ddpx-xeon:~$ sudo apt-get install libncurses-dev bison flex clang libssl-dev gawk libudev-dev libelf-dev

    NvSciC2C SW has two components: Loadable Kernel Module(LKM) and User-Space Library(USL).

    To build the NvSciC2C LKM on top of the Ubuntu bionic kernel clone, we must
    apply the NvSciC2C patch and the NTB Link MGMT patch it depends on, and then
    follow the standard BuildingYourOwnKernel process ratified by Canonical.

    The following steps can be done either on Intel Xeon or on Ubuntu host m/c.

    Steps:
    a. #### Clone the git repo.
        comex@ddpx-xeon:~$ git clone git://kernel.ubuntu.com/ubuntu/ubuntu-bionic.git
        comex@ddpx-xeon:~$ cd ubuntu-bionic/
        comex@ddpx-xeon:~$ git checkout -b temp Ubuntu-hwe-4.18.0-22.23_18.04.1

    b. ## Compile the objects into a separate directory.
        comex@ddpx-xeon:~$ export KBUILD_OUTPUT=`pwd`/out

    c. ### Fix a peculiar build error while making the kernel
        comex@ddpx-xeon:~$ cp debian/scripts/retpoline-extract-one scripts/ubuntu-retpoline-extract-one

    d. ### Apply the patches for nvscic2c LKM and its dependency
        comex@ddpx-xeon:~$ git apply ntb_link_mgmt_Ubuntu-hwe-4.18.0-22.23.patch
        comex@ddpx-xeon:~$ git apply nvscic2c_Ubuntu-hwe-4.18.0-22.23.patch

    e. ### Enable the NvSciC2C modules as 'm' via menuconfig. This will also set the Kconfig: NTB_LINK_MGMT to 'y'
        comex@ddpx-xeon:~$ make SUBLEVEL=0 EXTRAVERSION=-22-generic menuconfig

    f. ### Compile the kernel source. Set LOCALVERSION="" to avoid magic version mismatch issue while inserting NvSciC2C and NTB LKM(s)
        comex@ddpx-xeon:~$ make SUBLEVEL=0 EXTRAVERSION=-22-generic -j`getconf _NPROCESSORS_ONLN` LOCALVERSION=""


5. Inserting NvSciC2C LKM manually.

    Once the NvSciC2C LKM and NTB LKM(s) with NTB_LINK_MGMT='y' are compiled
    following steps 4[a-f], we can insert these modules manually. If built
    on host m/c, these must be copied to the Intel Xeon file-system. These must
    be inserted in the order listed here.

    The NvSciC2C LKM params must match the address and size of the memory
    region reserved in step (2).

    One can optionally add dyndbg=+p to each of these to have a verbose output.

        comex@ddpx-xeon:~$ sudo insmod $KBUILD_OUTPUT/drivers/ntb/ntb.ko
        comex@ddpx-xeon:~$ sudo insmod $KBUILD_OUTPUT/drivers/pci/switch/switchtec.ko
        comex@ddpx-xeon:~$ sudo insmod $KBUILD_OUTPUT/drivers/ntb/hw/mscc/ntb_hw_switchtec.ko max_mw_size=0x10000000
        comex@ddpx-xeon:~$ sudo insmod $KBUILD_OUTPUT/drivers/misc/nvscic2c/nvscic2c.ko fixed_mw_addr=0x90000000 fixed_mw_size=0x10000000

6. Contact:
    Arihant Jejani
    Bob Johnston
    Deepak Kumar Badgaiyan
    Tushar Padlikar

--------------------------------------------------------------------------------
/4.15.0.45/link_mgmt-4.15.0-45.48.patch:
--------------------------------------------------------------------------------
diff --git a/drivers/ntb/Kconfig b/drivers/ntb/Kconfig
index 95944e5..6803955 100644
--- a/drivers/ntb/Kconfig
+++ b/drivers/ntb/Kconfig
@@ -12,6 +12,15 @@ menuconfig NTB

 if NTB

+config NTB_LINK_MGMT
+	bool "Link management to avoid PCIe Rd for link detection"
+	help
+	  This is an additional option that is based on using 1 MSI-X per NTB DB
+	  and moves the NTB link detection to use purely MSI rather than making
+	  CPU read of MAGIC bytes over PCIe.
+
+	  If unsure, say N.
+
 source "drivers/ntb/hw/Kconfig"

 source "drivers/ntb/test/Kconfig"
diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index afe8ed6..c666297 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -67,6 +67,30 @@ static inline void _iowrite64(u64 val, void __iomem *mmio)
 #define SWITCHTEC_NTB_MAGIC 0x45CC0001
 #define MAX_MWS 128

+#ifdef CONFIG_NTB_LINK_MGMT
+/* In case we want to support fewer DBs than the default (28).
+ * The highest bit set here must stay within the 28 DBs. Because we do not
+ * have DT support, either use this or make it a module parameter.
+ */
+#define ALLOWED_DB (0x0FFFFFFF)
+
+/* assign distinct msi-x vectors to each db/msg.*/
+enum {
+	LINK_DOWN_VEC = 0,
+	LINK_UP_VEC = 1,
+	PARTITION_EVENT_VEC = 2,
+	DB_START_VEC = 3,
+	DB_END_VEC = 30,
+	UNUSED_VEC = 31,
+	MAXIMUM_VEC = 32,
+};
+
+struct switchtec_irq {
+	int irq_num;
+	bool isr_attached;
+};
+#endif //CONFIG_NTB_LINK_MGMT
+
 struct shared_mw {
 	u32 magic;
 	u32 link_sta;
@@ -85,8 +109,12 @@ struct switchtec_ntb {
 	int self_partition;
 	int peer_partition;

+#ifdef CONFIG_NTB_LINK_MGMT
+	struct switchtec_irq irqs[MAXIMUM_VEC];
+#else
 	int doorbell_irq;
 	int message_irq;
+#endif

 	struct ntb_info_regs __iomem *mmio_ntb;
 	struct ntb_ctrl_regs __iomem *mmio_ctrl;
@@ -470,6 +498,15 @@ enum {
 	MSG_CHECK_LINK = 3,
 };

+#ifdef CONFIG_NTB_LINK_MGMT
+enum {
+	MSG_REG_1 = 0,
+	MSG_REG_2 = 1,
+	MSG_REG_3 = 2,
+	MSG_REG_4 = 3,
+};
+#endif
+
 static void switchtec_ntb_check_link(struct switchtec_ntb *sndev)
 {
 	int link_sta;
@@ -489,7 +526,15 @@ static void switchtec_ntb_check_link(struct switchtec_ntb *sndev)
 	switchtec_ntb_set_link_speed(sndev);

 	if (link_sta != old) {
+#ifdef CONFIG_NTB_LINK_MGMT
+		if (link_sta) {
+			switchtec_ntb_send_msg(sndev, MSG_REG_1, MSG_LINK_UP);
+		} else {
+			switchtec_ntb_send_msg(sndev, MSG_REG_2, MSG_LINK_DOWN);
+		}
+#else
 		switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_CHECK_LINK);
+#endif
 		ntb_link_event(&sndev->ntb);
 		dev_info(&sndev->stdev->dev, "ntb link %s",
 			 link_sta ? "up" : "down");
@@ -526,9 +571,13 @@ static int switchtec_ntb_link_enable(struct ntb_dev *ntb,
 	dev_dbg(&sndev->stdev->dev, "enabling link");

 	sndev->self_shared->link_sta = 1;
+#ifdef CONFIG_NTB_LINK_MGMT
+	switchtec_ntb_send_msg(sndev, MSG_REG_1, MSG_LINK_UP);
+#else
 	switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_LINK_UP);

 	switchtec_ntb_check_link(sndev);
+#endif

 	return 0;
 }
@@ -540,10 +589,13 @@ static int switchtec_ntb_link_disable(struct ntb_dev *ntb)
 	dev_dbg(&sndev->stdev->dev, "disabling link");

 	sndev->self_shared->link_sta = 0;
+#ifdef CONFIG_NTB_LINK_MGMT
+	switchtec_ntb_send_msg(sndev, MSG_REG_2, MSG_LINK_DOWN);
+#else
 	switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_LINK_UP);

 	switchtec_ntb_check_link(sndev);
-
+#endif
 	return 0;
 }

@@ -556,16 +608,25 @@ static u64 switchtec_ntb_db_valid_mask(struct ntb_dev *ntb)

 static int switchtec_ntb_db_vector_count(struct ntb_dev *ntb)
 {
+#ifdef CONFIG_NTB_LINK_MGMT
+	struct switchtec_ntb *sndev = ntb_sndev(ntb);
+	return fls(sndev->db_valid_mask);
+#else
 	return 1;
+#endif
 }

 static u64 switchtec_ntb_db_vector_mask(struct ntb_dev *ntb, int db_vector)
 {
 	struct switchtec_ntb *sndev = ntb_sndev(ntb);

+#ifdef CONFIG_NTB_LINK_MGMT
+	if (db_vector < DB_START_VEC || db_vector > DB_END_VEC)
+		return 0;
+#else
 	if (db_vector < 0 || db_vector > 1)
 		return 0;
-
+#endif
 	return sndev->db_valid_mask;
 }

@@ -855,6 +916,10 @@ static void switchtec_ntb_init_db(struct switchtec_ntb *sndev)
 {
 	sndev->db_valid_mask = 0x0FFFFFFF;

+#ifdef CONFIG_NTB_LINK_MGMT
+	sndev->db_valid_mask = ALLOWED_DB;
+#endif
+
 	if (sndev->self_partition < sndev->peer_partition) {
 		sndev->db_shift = 0;
 		sndev->db_peer_shift = 32;
@@ -940,7 +1005,9 @@ static void switchtec_ntb_init_shared(struct switchtec_ntb *sndev)
 	int i;

 	memset(sndev->self_shared, 0, LUT_SIZE);
+#ifndef CONFIG_NTB_LINK_MGMT
 	sndev->self_shared->magic = SWITCHTEC_NTB_MAGIC;
+#endif
 	sndev->self_shared->partition_id = sndev->stdev->partition;

 	for (i = 0; i < sndev->nr_direct_mw; i++) {
@@ -1036,6 +1103,159 @@ static void switchtec_ntb_deinit_shared_mw(struct switchtec_ntb *sndev)
 				  sndev->self_shared_dma);
 }

+
+#ifdef CONFIG_NTB_LINK_MGMT
+static irqreturn_t switchtec_isr(int irq, void *dev)
+{
+	struct switchtec_ntb *sndev = dev;
+	int i = 0;
+
+	for (i = DB_START_VEC; i <= DB_END_VEC; i++) {
+		if (irq == sndev->irqs[i].irq_num) {
+			dev_dbg(&sndev->stdev->dev,
+				"doorbell: (%d)\n", (i - DB_START_VEC));
+			ntb_db_event(&sndev->ntb, (i - DB_START_VEC));
+			return IRQ_HANDLED;
+		}
+	}
+
+	/* we do not read message register to know the message type:
+	 * UP/DOWN/FORCE_DOWN as we avoid reading over PCIe by CPU.
+	 * Trust the remote to use the correct message register.
+	 * FORCE_LINK_DOWN must not use same register as LINK_UP.
202 | + */ 203 | + 204 | + if (irq == sndev->irqs[LINK_UP_VEC].irq_num) { 205 | + sndev->link_is_up = 1; 206 | + dev_dbg(&sndev->stdev->dev, "ntb link up"); 207 | + iowrite8(1, &sndev->mmio_self_dbmsg->imsg[MSG_REG_1].status); 208 | + switchtec_ntb_set_link_speed(sndev); 209 | + ntb_link_event(&sndev->ntb); 210 | + 211 | + return IRQ_HANDLED; 212 | + } 213 | + 214 | + if (irq == sndev->irqs[LINK_DOWN_VEC].irq_num) { 215 | + sndev->link_is_up = 0; 216 | + dev_dbg(&sndev->stdev->dev, "ntb link down"); 217 | + iowrite8(1, &sndev->mmio_self_dbmsg->imsg[MSG_REG_2].status); 218 | + switchtec_ntb_set_link_speed(sndev); 219 | + ntb_link_event(&sndev->ntb); 220 | + 221 | + return IRQ_HANDLED; 222 | + } 223 | + 224 | + return IRQ_HANDLED; 225 | +} 226 | + 227 | +static void switchtec_ntb_deinit_db_msg_irq(struct switchtec_ntb *sndev) 228 | +{ 229 | + int i = 0; 230 | + 231 | + for (i = LINK_DOWN_VEC; i < MAXIMUM_VEC; i++) { 232 | + if (sndev->irqs[i].isr_attached) { 233 | + free_irq(sndev->irqs[i].irq_num, sndev); 234 | + sndev->irqs[i].isr_attached = false; 235 | + sndev->irqs[i].irq_num = 0; 236 | + } 237 | + } 238 | +} 239 | + 240 | +static int switchtec_ntb_init_db_msg_irq(struct switchtec_ntb *sndev) 241 | +{ 242 | + int i; 243 | + int rc; 244 | + int event_irq; 245 | + uint32_t bit; 246 | + int idb_vecs = sizeof(sndev->mmio_self_dbmsg->idb_vec_map); 247 | + 248 | + event_irq = ioread32(&sndev->stdev->mmio_part_cfg->vep_vector_number); 249 | + 250 | + /** 251 | + * Initalize vector number to be used for each DB and message register. 252 | + */ 253 | + for (i = 0; i < ((idb_vecs/2) - 4); i++) { 254 | + iowrite8((i + DB_START_VEC), 255 | + &sndev->mmio_self_dbmsg->idb_vec_map[i]); 256 | + } 257 | + 258 | + /* TODO we are leaving 4 db here as each partition using only 28 db's. 259 | + * Though i feel each partition can use 30 db's. out of 64, 4 are 260 | + * reserved for message registers and rest are free for db. so 60 can 261 | + * be divided between both partitions. 
for message register mmap is 262 | + * there which helps using same index in both partitions. It was same 263 | + * earlier as well. though its only theoretical understanding and yet 264 | + * to verify. Hence not touching current logic of db and message mgmt. 265 | + */ 266 | + for (i = (idb_vecs/2); i < (idb_vecs - 4); i++) { 267 | + iowrite8(((i - (idb_vecs/2)) + DB_START_VEC), 268 | + &sndev->mmio_self_dbmsg->idb_vec_map[i]); 269 | + } 270 | + 271 | + iowrite8(LINK_UP_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i++]); 272 | + iowrite8(LINK_DOWN_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i++]); 273 | + 274 | + for (; i < idb_vecs; i++) 275 | + iowrite8(MAXIMUM_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i]); 276 | + 277 | + dev_dbg(&sndev->stdev->dev, 278 | + "irqs - event: %d, db: [%d-%d], msgs: [%d-%d]", 279 | + event_irq, DB_START_VEC, DB_END_VEC, 280 | + LINK_DOWN_VEC, LINK_UP_VEC); 281 | + 282 | + /** 283 | + * Attach ISR for :- 284 | + * Message Register 1 to be used for NTB link up. 285 | + * Message Register 2 to be used for NTB link down. 286 | + * To all allowed doorbells. 
287 | + */ 288 | + sndev->irqs[LINK_UP_VEC].irq_num = 289 | + pci_irq_vector(sndev->stdev->pdev, LINK_UP_VEC); 290 | + rc = request_irq(sndev->irqs[LINK_UP_VEC].irq_num, switchtec_isr, 0, 291 | + "switchtec_link_up_msg", sndev); 292 | + if (rc) { 293 | + goto deinit_db_msg_irq; 294 | + } else { 295 | + sndev->irqs[LINK_UP_VEC].isr_attached = true; 296 | + } 297 | + 298 | + sndev->irqs[LINK_DOWN_VEC].irq_num = 299 | + pci_irq_vector(sndev->stdev->pdev, LINK_DOWN_VEC); 300 | + rc = request_irq(sndev->irqs[LINK_DOWN_VEC].irq_num, switchtec_isr, 0, 301 | + "switchtec_link_down_msg", sndev); 302 | + if (rc) { 303 | + goto deinit_db_msg_irq; 304 | + } else { 305 | + sndev->irqs[LINK_DOWN_VEC].isr_attached = true; 306 | + } 307 | + 308 | + bit = 0; 309 | + for_each_set_bit(bit, (long unsigned int *)(&sndev->db_valid_mask), 310 | + fls(ALLOWED_DB)) { 311 | + int idx = 0; 312 | + idx = (DB_START_VEC + bit); 313 | + sndev->irqs[idx].irq_num = pci_irq_vector(sndev->stdev->pdev, 314 | + idx); 315 | + rc = request_irq(sndev->irqs[idx].irq_num, switchtec_isr, 0, 316 | + "switchtec_doorbell", sndev); 317 | + if (rc) { 318 | + goto deinit_db_msg_irq; 319 | + } else { 320 | + sndev->irqs[idx].isr_attached = true; 321 | + } 322 | + } 323 | + 324 | + dev_dbg(&sndev->stdev->dev, "Registered switchtec_isr for db and msg"); 325 | + 326 | + return rc; 327 | + 328 | +deinit_db_msg_irq: 329 | + switchtec_ntb_deinit_db_msg_irq(sndev); 330 | + return rc; 331 | +} 332 | + 333 | +#else 334 | + 335 | static irqreturn_t switchtec_ntb_doorbell_isr(int irq, void *dev) 336 | { 337 | struct switchtec_ntb *sndev = dev; 338 | @@ -1121,6 +1341,7 @@ static void switchtec_ntb_deinit_db_msg_irq(struct switchtec_ntb *sndev) 339 | free_irq(sndev->doorbell_irq, sndev); 340 | free_irq(sndev->message_irq, sndev); 341 | } 342 | +#endif 343 | 344 | static int switchtec_ntb_add(struct device *dev, 345 | struct class_interface *class_intf) 346 | diff --git a/drivers/ntb/test/ntb_tool.c 
b/drivers/ntb/test/ntb_tool.c 347 | index 91526a9..34cdc46 100644 348 | --- a/drivers/ntb/test/ntb_tool.c 349 | +++ b/drivers/ntb/test/ntb_tool.c 350 | @@ -143,6 +143,9 @@ struct tool_ctx { 351 | wait_queue_head_t link_wq; 352 | int mw_count; 353 | struct tool_mw mws[MAX_MWS]; 354 | +#ifdef CONFIG_NTB_LINK_MGMT 355 | + bool retrigger_link; 356 | +#endif 357 | }; 358 | 359 | #define SPAD_FNAME_SIZE 0x10 360 | @@ -166,6 +169,20 @@ static void tool_link_event(void *ctx) 361 | 362 | up = ntb_link_is_up(tc->ntb, &speed, &width); 363 | 364 | +#ifdef CONFIG_NTB_LINK_MGMT 365 | + if (up) { 366 | + if (tc->retrigger_link) { 367 | + /* only once for every link up.*/ 368 | + ntb_link_enable(tc->ntb, NTB_SPEED_AUTO, 369 | + NTB_WIDTH_AUTO); 370 | + tc->retrigger_link = false; 371 | + } 372 | + } else { 373 | + /* link is down, we will have to do re-trigger on up.*/ 374 | + tc->retrigger_link = true; 375 | + } 376 | +#endif 377 | + 378 | dev_dbg(&tc->ntb->dev, "link is %s speed %d width %d\n", 379 | up ? "up" : "down", speed, width); 380 | 381 | @@ -957,6 +974,9 @@ static int tool_probe(struct ntb_client *self, struct ntb_dev *ntb) 382 | } 383 | 384 | tc->ntb = ntb; 385 | +#ifdef CONFIG_NTB_LINK_MGMT 386 | + tc->retrigger_link = true; 387 | +#endif 388 | init_waitqueue_head(&tc->link_wq); 389 | 390 | tc->mw_count = min(ntb_peer_mw_count(tc->ntb), MAX_MWS); 391 | diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c 392 | index 730cc89..762ad9b 100644 393 | --- a/drivers/pci/switch/switchtec.c 394 | +++ b/drivers/pci/switch/switchtec.c 395 | @@ -1185,8 +1185,19 @@ static int switchtec_init_isr(struct switchtec_dev *stdev) 396 | int nvecs; 397 | int event_irq; 398 | 399 | +#ifdef CONFIG_NTB_LINK_MGMT 400 | + if (stdev->mmio_sys_info->device_id != 0x8534) { 401 | + nvecs = pci_alloc_irq_vectors(stdev->pdev, 1, 4, 402 | + PCI_IRQ_MSIX | PCI_IRQ_MSI); 403 | + } else { 404 | + /* 1 MSI-X per NTB DB/MSG. 
- NVIDIA*/ 405 | + nvecs = pci_alloc_irq_vectors(stdev->pdev, 32, 32, 406 | + PCI_IRQ_MSIX | PCI_IRQ_MSI); 407 | + } 408 | +#else 409 | nvecs = pci_alloc_irq_vectors(stdev->pdev, 1, 4, 410 | PCI_IRQ_MSIX | PCI_IRQ_MSI); 411 | +#endif 412 | if (nvecs < 0) 413 | return nvecs; 414 | 415 | -------------------------------------------------------------------------------- /4.18.0.22/ntb_link_mgmt_Ubuntu-hwe-4.18.0-22.23.patch: -------------------------------------------------------------------------------- 1 | diff --git a/drivers/ntb/Kconfig b/drivers/ntb/Kconfig 2 | index 95944e5..6803955 100644 3 | --- a/drivers/ntb/Kconfig 4 | +++ b/drivers/ntb/Kconfig 5 | @@ -12,6 +12,15 @@ menuconfig NTB 6 | 7 | if NTB 8 | 9 | +config NTB_LINK_MGMT 10 | + bool "Link management to avoid PCIe Rd for link detection" 11 | + help 12 | + This is an additional option that is based on using 1 MSI-X per NTB DB 13 | + and moves the NTB link detection to use purely MSI rather than making 14 | + CPU read of MAGIC bytes over PCIe. 15 | + 16 | + If unsure, say N. 17 | + 18 | source "drivers/ntb/hw/Kconfig" 19 | 20 | source "drivers/ntb/test/Kconfig" 21 | diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 22 | index f624ae2..9d6da0a 100644 23 | --- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 24 | +++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 25 | @@ -67,6 +67,30 @@ static inline void _iowrite64(u64 val, void __iomem *mmio) 26 | #define SWITCHTEC_NTB_MAGIC 0x45CC0001 27 | #define MAX_MWS 128 28 | 29 | +#ifdef CONFIG_NTB_LINK_MGMT 30 | +/* in-case we wanted to supported DB less than default(28). 31 | + * This value should never exceed set MSB > 28. Because we do not 32 | + * have DT support, either use this or make it module parameter. 
33 | + */ 34 | +#define ALLOWED_DB (0x0FFFFFFF) 35 | + 36 | +/* assign distinct msi-x vectors to each db/msg.*/ 37 | +enum { 38 | + LINK_DOWN_VEC = 0, 39 | + LINK_UP_VEC = 1, 40 | + PARTITION_EVENT_VEC = 2, 41 | + DB_START_VEC = 3, 42 | + DB_END_VEC = 30, 43 | + UNUSED_VEC = 31, 44 | + MAXIMUM_VEC = 32, 45 | +}; 46 | + 47 | +struct switchtec_irq { 48 | + int irq_num; 49 | + bool isr_attached; 50 | +}; 51 | +#endif //CONFIG_NTB_LINK_MGMT 52 | + 53 | struct shared_mw { 54 | u32 magic; 55 | u32 link_sta; 56 | @@ -85,9 +109,12 @@ struct switchtec_ntb { 57 | int self_partition; 58 | int peer_partition; 59 | 60 | +#ifdef CONFIG_NTB_LINK_MGMT 61 | + struct switchtec_irq irqs[MAXIMUM_VEC]; 62 | +#else 63 | int doorbell_irq; 64 | int message_irq; 65 | - 66 | +#endif 67 | struct ntb_info_regs __iomem *mmio_ntb; 68 | struct ntb_ctrl_regs __iomem *mmio_ctrl; 69 | struct ntb_dbmsg_regs __iomem *mmio_dbmsg; 70 | @@ -516,6 +543,16 @@ enum switchtec_msg { 71 | MSG_LINK_FORCE_DOWN = 4, 72 | }; 73 | 74 | +#ifdef CONFIG_NTB_LINK_MGMT 75 | +enum { 76 | + MSG_REG_1 = 0, 77 | + MSG_REG_2 = 1, 78 | + MSG_REG_3 = 2, 79 | + MSG_REG_4 = 3, 80 | +}; 81 | +#endif 82 | + 83 | + 84 | static int switchtec_ntb_reinit_peer(struct switchtec_ntb *sndev); 85 | 86 | static void link_reinit_work(struct work_struct *work) 87 | @@ -559,7 +596,15 @@ static void switchtec_ntb_check_link(struct switchtec_ntb *sndev, 88 | switchtec_ntb_set_link_speed(sndev); 89 | 90 | if (link_sta != old) { 91 | +#ifdef CONFIG_NTB_LINK_MGMT 92 | + if (link_sta) { 93 | + switchtec_ntb_send_msg(sndev, MSG_REG_1, MSG_LINK_UP); 94 | + } else { 95 | + switchtec_ntb_send_msg(sndev, MSG_REG_2, MSG_LINK_DOWN); 96 | + } 97 | +#else 98 | switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_CHECK_LINK); 99 | +#endif 100 | ntb_link_event(&sndev->ntb); 101 | dev_info(&sndev->stdev->dev, "ntb link %s\n", 102 | link_sta ? 
"up" : "down"); 103 | @@ -599,10 +644,13 @@ static int switchtec_ntb_link_enable(struct ntb_dev *ntb, 104 | dev_dbg(&sndev->stdev->dev, "enabling link\n"); 105 | 106 | sndev->self_shared->link_sta = 1; 107 | +#ifdef CONFIG_NTB_LINK_MGMT 108 | + switchtec_ntb_send_msg(sndev, MSG_REG_1, MSG_LINK_UP); 109 | +#else 110 | switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_LINK_UP); 111 | 112 | switchtec_ntb_check_link(sndev, MSG_CHECK_LINK); 113 | - 114 | +#endif 115 | return 0; 116 | } 117 | 118 | @@ -613,10 +661,13 @@ static int switchtec_ntb_link_disable(struct ntb_dev *ntb) 119 | dev_dbg(&sndev->stdev->dev, "disabling link\n"); 120 | 121 | sndev->self_shared->link_sta = 0; 122 | +#ifdef CONFIG_NTB_LINK_MGMT 123 | + switchtec_ntb_send_msg(sndev, MSG_REG_2, MSG_LINK_DOWN); 124 | +#else 125 | switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_LINK_DOWN); 126 | 127 | switchtec_ntb_check_link(sndev, MSG_CHECK_LINK); 128 | - 129 | +#endif 130 | return 0; 131 | } 132 | 133 | @@ -629,16 +680,25 @@ static u64 switchtec_ntb_db_valid_mask(struct ntb_dev *ntb) 134 | 135 | static int switchtec_ntb_db_vector_count(struct ntb_dev *ntb) 136 | { 137 | +#ifdef CONFIG_NTB_LINK_MGMT 138 | + struct switchtec_ntb *sndev = ntb_sndev(ntb); 139 | + return fls(sndev->db_valid_mask); 140 | +#else 141 | return 1; 142 | +#endif 143 | } 144 | 145 | static u64 switchtec_ntb_db_vector_mask(struct ntb_dev *ntb, int db_vector) 146 | { 147 | struct switchtec_ntb *sndev = ntb_sndev(ntb); 148 | 149 | +#ifdef CONFIG_NTB_LINK_MGMT 150 | + if (db_vector < DB_START_VEC || db_vector > DB_END_VEC) 151 | + return 0; 152 | +#else 153 | if (db_vector < 0 || db_vector > 1) 154 | return 0; 155 | - 156 | +#endif 157 | return sndev->db_valid_mask; 158 | } 159 | 160 | @@ -1261,6 +1321,9 @@ static void switchtec_ntb_init_db(struct switchtec_ntb *sndev) 161 | sndev->db_valid_mask = 0x0FFFFFFF; 162 | } 163 | 164 | +#ifdef CONFIG_NTB_LINK_MGMT 165 | + sndev->db_valid_mask = ALLOWED_DB; 166 | +#endif 167 | 
iowrite64(~sndev->db_mask, &sndev->mmio_self_dbmsg->idb_mask); 168 | iowrite64(sndev->db_valid_mask << sndev->db_peer_shift, 169 | &sndev->mmio_peer_dbmsg->odb_mask); 170 | @@ -1311,7 +1374,9 @@ static void switchtec_ntb_init_shared(struct switchtec_ntb *sndev) 171 | int i; 172 | 173 | memset(sndev->self_shared, 0, LUT_SIZE); 174 | +#ifndef CONFIG_NTB_LINK_MGMT 175 | sndev->self_shared->magic = SWITCHTEC_NTB_MAGIC; 176 | +#endif 177 | sndev->self_shared->partition_id = sndev->stdev->partition; 178 | 179 | for (i = 0; i < sndev->nr_direct_mw; i++) { 180 | @@ -1384,6 +1449,171 @@ static void switchtec_ntb_deinit_shared_mw(struct switchtec_ntb *sndev) 181 | sndev->nr_rsvd_luts--; 182 | } 183 | 184 | +#ifdef CONFIG_NTB_LINK_MGMT 185 | +static irqreturn_t switchtec_isr(int irq, void *dev) 186 | +{ 187 | + struct switchtec_ntb *sndev = dev; 188 | + int i = 0; 189 | + 190 | + for (i = DB_START_VEC; i <= DB_END_VEC; i++) { 191 | + if (irq == sndev->irqs[i].irq_num) { 192 | + dev_dbg(&sndev->stdev->dev, 193 | + "doorbell: (%d)\n", (i - DB_START_VEC)); 194 | + ntb_db_event(&sndev->ntb, (i - DB_START_VEC)); 195 | + return IRQ_HANDLED; 196 | + } 197 | + } 198 | + 199 | + /* we do not read message register to know the message type: 200 | + * UP/DOWN/FORCE_DOWN as we avoid reading over PCIe by CPU. 201 | + * Trust the remote to use the correct message register. 202 | + * FORCE_LINK_DOWN must not use same register as LINK_UP. 203 | + */ 204 | + 205 | + if (irq == sndev->irqs[LINK_UP_VEC].irq_num) { 206 | + bool crosslink_db_init = false; 207 | + if (sndev->link_is_up == 0) { 208 | + /* 209 | + * link was down earlier. We are expected to get 210 | + * continuous LINK_UP msgs with NTB link HB design. 211 | + * Init (again) the DB when in crosslink mode when 212 | + * transition from down->up. 
213 | + */ 214 | + crosslink_db_init = true; 215 | + } 216 | + sndev->link_is_up = 1; 217 | + dev_dbg(&sndev->stdev->dev, "ntb link up"); 218 | + iowrite8(1, &sndev->mmio_self_dbmsg->imsg[MSG_REG_1].status); 219 | + switchtec_ntb_set_link_speed(sndev); 220 | + if (crosslink_db_init == true) { 221 | + crosslink_init_dbmsgs(sndev); 222 | + } 223 | + ntb_link_event(&sndev->ntb); 224 | + 225 | + return IRQ_HANDLED; 226 | + } 227 | + 228 | + if (irq == sndev->irqs[LINK_DOWN_VEC].irq_num) { 229 | + sndev->link_is_up = 0; 230 | + dev_dbg(&sndev->stdev->dev, "ntb link down"); 231 | + iowrite8(1, &sndev->mmio_self_dbmsg->imsg[MSG_REG_2].status); 232 | + switchtec_ntb_set_link_speed(sndev); 233 | + ntb_link_event(&sndev->ntb); 234 | + 235 | + return IRQ_HANDLED; 236 | + } 237 | + 238 | + return IRQ_HANDLED; 239 | +} 240 | + 241 | +static void switchtec_ntb_deinit_db_msg_irq(struct switchtec_ntb *sndev) 242 | +{ 243 | + int i = 0; 244 | + 245 | + for (i = LINK_DOWN_VEC; i < MAXIMUM_VEC; i++) { 246 | + if (sndev->irqs[i].isr_attached) { 247 | + free_irq(sndev->irqs[i].irq_num, sndev); 248 | + sndev->irqs[i].isr_attached = false; 249 | + sndev->irqs[i].irq_num = 0; 250 | + } 251 | + } 252 | +} 253 | + 254 | +static int switchtec_ntb_init_db_msg_irq(struct switchtec_ntb *sndev) 255 | +{ 256 | + int i; 257 | + int rc; 258 | + int event_irq; 259 | + uint32_t bit; 260 | + int idb_vecs = sizeof(sndev->mmio_self_dbmsg->idb_vec_map); 261 | + 262 | + event_irq = ioread32(&sndev->stdev->mmio_part_cfg->vep_vector_number); 263 | + 264 | + /** 265 | + * Initalize vector number to be used for each DB and message register. 266 | + */ 267 | + for (i = 0; i < ((idb_vecs/2) - 4); i++) { 268 | + iowrite8((i + DB_START_VEC), 269 | + &sndev->mmio_self_dbmsg->idb_vec_map[i]); 270 | + } 271 | + 272 | + /* TODO we are leaving 4 db here as each partition using only 28 db's. 273 | + * Though i feel each partition can use 30 db's. 
out of 64, 4 are 274 | + * reserved for message registers and rest are free for db. so 60 can 275 | + * be divided between both partitions. for message register mmap is 276 | + * there which helps using same index in both partitions. It was same 277 | + * earlier as well. though its only theoretical understanding and yet 278 | + * to verify. Hence not touching current logic of db and message mgmt. 279 | + */ 280 | + for (i = (idb_vecs/2); i < (idb_vecs - 4); i++) { 281 | + iowrite8(((i - (idb_vecs/2)) + DB_START_VEC), 282 | + &sndev->mmio_self_dbmsg->idb_vec_map[i]); 283 | + } 284 | + 285 | + iowrite8(LINK_UP_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i++]); 286 | + iowrite8(LINK_DOWN_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i++]); 287 | + 288 | + for (; i < idb_vecs; i++) 289 | + iowrite8(MAXIMUM_VEC, &sndev->mmio_self_dbmsg->idb_vec_map[i]); 290 | + 291 | + dev_dbg(&sndev->stdev->dev, 292 | + "irqs - event: %d, db: [%d-%d], msgs: [%d-%d]", 293 | + event_irq, DB_START_VEC, DB_END_VEC, 294 | + LINK_DOWN_VEC, LINK_UP_VEC); 295 | + 296 | + /** 297 | + * Attach ISR for :- 298 | + * Message Register 1 to be used for NTB link up. 299 | + * Message Register 2 to be used for NTB link down. 300 | + * To all allowed doorbells. 
301 | + */ 302 | + sndev->irqs[LINK_UP_VEC].irq_num = 303 | + pci_irq_vector(sndev->stdev->pdev, LINK_UP_VEC); 304 | + rc = request_irq(sndev->irqs[LINK_UP_VEC].irq_num, switchtec_isr, 0, 305 | + "switchtec_link_up_msg", sndev); 306 | + if (rc) { 307 | + goto deinit_db_msg_irq; 308 | + } else { 309 | + sndev->irqs[LINK_UP_VEC].isr_attached = true; 310 | + } 311 | + 312 | + sndev->irqs[LINK_DOWN_VEC].irq_num = 313 | + pci_irq_vector(sndev->stdev->pdev, LINK_DOWN_VEC); 314 | + rc = request_irq(sndev->irqs[LINK_DOWN_VEC].irq_num, switchtec_isr, 0, 315 | + "switchtec_link_down_msg", sndev); 316 | + if (rc) { 317 | + goto deinit_db_msg_irq; 318 | + } else { 319 | + sndev->irqs[LINK_DOWN_VEC].isr_attached = true; 320 | + } 321 | + 322 | + bit = 0; 323 | + for_each_set_bit(bit, (long unsigned int *)(&sndev->db_valid_mask), 324 | + fls(ALLOWED_DB)) { 325 | + int idx = 0; 326 | + idx = (DB_START_VEC + bit); 327 | + sndev->irqs[idx].irq_num = pci_irq_vector(sndev->stdev->pdev, 328 | + idx); 329 | + rc = request_irq(sndev->irqs[idx].irq_num, switchtec_isr, 0, 330 | + "switchtec_doorbell", sndev); 331 | + if (rc) { 332 | + goto deinit_db_msg_irq; 333 | + } else { 334 | + sndev->irqs[idx].isr_attached = true; 335 | + } 336 | + } 337 | + 338 | + dev_dbg(&sndev->stdev->dev, "Registered switchtec_isr for db and msg"); 339 | + 340 | + return rc; 341 | + 342 | +deinit_db_msg_irq: 343 | + switchtec_ntb_deinit_db_msg_irq(sndev); 344 | + return rc; 345 | +} 346 | + 347 | +#else 348 | + 349 | static irqreturn_t switchtec_ntb_doorbell_isr(int irq, void *dev) 350 | { 351 | struct switchtec_ntb *sndev = dev; 352 | @@ -1469,6 +1699,7 @@ static void switchtec_ntb_deinit_db_msg_irq(struct switchtec_ntb *sndev) 353 | free_irq(sndev->doorbell_irq, sndev); 354 | free_irq(sndev->message_irq, sndev); 355 | } 356 | +#endif 357 | 358 | static int switchtec_ntb_reinit_peer(struct switchtec_ntb *sndev) 359 | { 360 | @@ -1520,12 +1751,24 @@ static int switchtec_ntb_add(struct device *dev, 361 | if (rc) 
362 | goto deinit_shared_and_exit; 363 | 364 | +#ifdef CONFIG_NTB_LINK_MGMT 365 | + /* either we use MSG_REG_2 to send FORCE_DOWN message or 366 | + * do not send it at as we have our link management to detect 367 | + * remote going away. Basically, we avoid read over PCIe and 368 | + * therefore use same register as LINK_DOWN for FORCE_DOWN 369 | + * as well. 370 | + * 371 | + * We choose to disable this as we have our own way to detect 372 | + * remote going away. 373 | + */ 374 | +#else 375 | /* 376 | * If this host crashed, the other host may think the link is 377 | * still up. Tell them to force it down (it will go back up 378 | * once we register the ntb device). 379 | */ 380 | switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_LINK_FORCE_DOWN); 381 | +#endif 382 | 383 | rc = ntb_register_device(&sndev->ntb); 384 | if (rc) 385 | diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c 386 | index d592c0f..149bf5b 100644 387 | --- a/drivers/ntb/test/ntb_tool.c 388 | +++ b/drivers/ntb/test/ntb_tool.c 389 | @@ -267,6 +267,9 @@ struct tool_ctx { 390 | int inspad_cnt; 391 | struct tool_spad *inspads; 392 | struct dentry *dbgfs_dir; 393 | +#ifdef CONFIG_NTB_LINK_MGMT 394 | + bool retrigger_link; 395 | +#endif 396 | }; 397 | 398 | #define TOOL_FOPS_RDWR(__name, __read, __write) \ 399 | @@ -295,6 +298,20 @@ static void tool_link_event(void *ctx) 400 | 401 | up = ntb_link_is_up(tc->ntb, &speed, &width); 402 | 403 | +#ifdef CONFIG_NTB_LINK_MGMT 404 | + if (up) { 405 | + if (tc->retrigger_link) { 406 | + /* only once for every link up.*/ 407 | + ntb_link_enable(tc->ntb, NTB_SPEED_AUTO, 408 | + NTB_WIDTH_AUTO); 409 | + tc->retrigger_link = false; 410 | + } 411 | + } else { 412 | + /* link is down, we will have to do re-trigger on up.*/ 413 | + tc->retrigger_link = true; 414 | + } 415 | +#endif 416 | + 417 | dev_dbg(&tc->ntb->dev, "link is %s speed %d width %d\n", 418 | up ? 
"up" : "down", speed, width); 419 | 420 | @@ -1451,6 +1468,10 @@ static struct tool_ctx *tool_create_data(struct ntb_dev *ntb) 421 | init_waitqueue_head(&tc->db_wq); 422 | init_waitqueue_head(&tc->msg_wq); 423 | 424 | +#ifdef CONFIG_NTB_LINK_MGMT 425 | + tc->retrigger_link = true; 426 | +#endif 427 | + 428 | if (ntb_db_is_unsafe(ntb)) 429 | dev_dbg(&ntb->dev, "doorbell is unsafe\n"); 430 | 431 | diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c 432 | index f96af14..b69a245 100644 433 | --- a/drivers/pci/switch/switchtec.c 434 | +++ b/drivers/pci/switch/switchtec.c 435 | @@ -1181,8 +1181,19 @@ static int switchtec_init_isr(struct switchtec_dev *stdev) 436 | int nvecs; 437 | int event_irq; 438 | 439 | +#ifdef CONFIG_NTB_LINK_MGMT 440 | + if (stdev->mmio_sys_info->device_id != 0x8534) { 441 | + nvecs = pci_alloc_irq_vectors(stdev->pdev, 1, 4, 442 | + PCI_IRQ_MSIX | PCI_IRQ_MSI); 443 | + } else { 444 | + /* 1 MSI-X per NTB DB/MSG. - NVIDIA*/ 445 | + nvecs = pci_alloc_irq_vectors(stdev->pdev, 32, 32, 446 | + PCI_IRQ_MSIX | PCI_IRQ_MSI); 447 | + } 448 | +#else 449 | nvecs = pci_alloc_irq_vectors(stdev->pdev, 1, 4, 450 | PCI_IRQ_MSIX | PCI_IRQ_MSI); 451 | +#endif 452 | if (nvecs < 0) 453 | return nvecs; 454 | 455 | -------------------------------------------------------------------------------- /4.18.0.22/nvscic2c_Ubuntu-hwe-4.18.0-22.23.patch: -------------------------------------------------------------------------------- 1 | diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig 2 | index 3726eac..0aec171 100644 3 | --- a/drivers/misc/Kconfig 4 | +++ b/drivers/misc/Kconfig 5 | @@ -527,4 +527,5 @@ source "drivers/misc/echo/Kconfig" 6 | source "drivers/misc/cxl/Kconfig" 7 | source "drivers/misc/ocxl/Kconfig" 8 | source "drivers/misc/cardreader/Kconfig" 9 | +source "drivers/misc/nvscic2c/Kconfig" 10 | endmenu 11 | diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile 12 | index af22bbc..e917a76 100644 13 | --- a/drivers/misc/Makefile 14 | 
+++ b/drivers/misc/Makefile 15 | @@ -58,3 +58,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o 16 | obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o 17 | obj-$(CONFIG_OCXL) += ocxl/ 18 | obj-$(CONFIG_MISC_RTSX) += cardreader/ 19 | +obj-$(CONFIG_NVSCIC2C) += nvscic2c/ 20 | diff --git a/drivers/misc/nvscic2c/Kconfig b/drivers/misc/nvscic2c/Kconfig 21 | new file mode 100644 22 | index 0000000..8de3534 23 | --- /dev/null 24 | +++ b/drivers/misc/nvscic2c/Kconfig 25 | @@ -0,0 +1,31 @@ 26 | +if X86_64 27 | +config NVSCIC2C 28 | + tristate "Enable Nvidia Host-to-Host data transfer over PCIe-NTB module" 29 | + depends on NTB && NTB_SWITCHTEC 30 | + select NTB_LINK_MGMT 31 | + default n 32 | + help 33 | + This enables SoftwareCommunicationInterface for Host-to-Host 34 | + communication over PCIe. This is possible only via NTB at the 35 | + moment and for the MicroSemi/MicroChip PM8534 switch with NTB 36 | + vEPs. We also enable the NTB link management that is introduced 37 | + by NVIDIA Corp Ltd. to not make PCIe Rd by CPU to detect remote 38 | + link UP. 39 | + If unsure, Please say N. 40 | +endif 41 | + 42 | +if ARCH_TEGRA 43 | +config NVSCIC2C 44 | + tristate "Enable Nvidia Host-to-Host data transfer over PCIe-NTB module" 45 | + depends on NTB && ARCH_TEGRA_19x_SOC 46 | + select NTB_LINK_MGMT 47 | + help 48 | + This enables SoftwareCommunicationInterface for Host-to-Host 49 | + communication over PCIe. This is possible only via NTB at the 50 | + moment and for the MicroSemi/MicroChip PM8534 switch with NTB 51 | + vEPs. We also enable the NTB link management that is introduced 52 | + by NVIDIA Corp Ltd. to not make PCIe Rd by CPU to detect remote 53 | + link UP. 54 | + If unsure, Please say N. 
55 | +endif 56 | +# removed NTB_SWITCHTEC for tegra on k-4.9 57 | diff --git a/drivers/misc/nvscic2c/Makefile b/drivers/misc/nvscic2c/Makefile 58 | new file mode 100644 59 | index 0000000..65a6c1c 60 | --- /dev/null 61 | +++ b/drivers/misc/nvscic2c/Makefile 62 | @@ -0,0 +1,24 @@ 63 | +# 64 | +# drivers/misc/nvsci-c2c-x86/Makefile 65 | +# 66 | +# This program is free software; you can redistribute it and/or modify it 67 | +# under the terms and conditions of the GNU General Public License, 68 | +# version 2, as published by the Free Software Foundation. 69 | +# 70 | +# This program is distributed in the hope it will be useful, but WITHOUT 71 | +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 72 | +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 73 | +# more details. 74 | +# 75 | +# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 76 | +# 77 | + 78 | +ccflags-y += -Werror 79 | +obj-$(CONFIG_NVSCIC2C) += nvscic2c.o 80 | +nvscic2c-objs += channel-cdev.o \ 81 | + channel-ops.o \ 82 | + config.o \ 83 | + link-mgmt.o \ 84 | + module.o \ 85 | + ntb-client.o 86 | +nvscic2c-$(CONFIG_DEBUG_FS) += channel-dbgfs.o 87 | diff --git a/drivers/misc/nvscic2c/channel-cdev.c b/drivers/misc/nvscic2c/channel-cdev.c 88 | new file mode 100644 89 | index 0000000..b312acbd 90 | --- /dev/null 91 | +++ b/drivers/misc/nvscic2c/channel-cdev.c 92 | @@ -0,0 +1,860 @@ 93 | +/* 94 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 95 | + * 96 | + * This program is free software; you can redistribute it and/or modify it 97 | + * under the terms and conditions of the GNU General Public License, 98 | + * version 2, as published by the Free Software Foundation. 99 | + * 100 | + * This program is distributed in the hope it will be useful, but WITHOUT 101 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 102 | + * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License for 103 | + * more details. 104 | + */ 105 | + 106 | +#include "channel.h" 107 | +#include "chip-to-chip.h" 108 | +#include 109 | +#include 110 | +#include 111 | +#include 112 | +#include 113 | +#include 114 | +#include 115 | +#include 116 | +#include 117 | +#include 118 | +#include 119 | + 120 | + 121 | +/* 122 | + * Deals with Chip-To-Chip channel devices. 123 | + * - creating char devices for each nvscic2c channel 124 | + * - implement the fops: open(), close(), mmap(), poll(), ioctl(). 125 | + * - We shall not support Read, Write from User space for this device 126 | + * - Every channel also has a chunk of Self(Rx) and Peer(Tx) memory 127 | + * across PCIe. 128 | + * - Every channel also allocates a private memory exposed to user-space 129 | + * but not across PCIe. 130 | + */ 131 | + 132 | +/* prototype.*/ 133 | +static int ioctl_get_info_impl(struct channel_t *channel, 134 | + struct nvscic2c_info *get_info); 135 | + 136 | +/* prototype.*/ 137 | +static int ioctl_notify_remote_impl(struct channel_t *channel, 138 | + uint8_t *db_bits); 139 | + 140 | +/* prototype.*/ 141 | +static int update_db_ch_tbl(struct channel_drv_ctx_t *ch_drv_ctx, 142 | + struct channel_t *channel, bool add); 143 | + 144 | + 145 | +/* Channel cdev context. If not making it global here, add it to c2c_drv_ctx.*/ 146 | +static struct channel_drv_ctx_t *ch_drv_ctx; 147 | + 148 | + 149 | +/* 150 | + * open() syscall backing for nvscic2c channel devices. 151 | + * 152 | + * We do not allow same channel to be opened more than once on the lines of 153 | + * ivc/ivm mempools. 154 | + * 155 | + * Populate the channel_device internal data-structure into fops private data 156 | + * for subsequent calls to other fops handlers. 
157 | + */ 158 | +static int channel_fops_open(struct inode *inode, struct file *filp) 159 | +{ 160 | + int ret = 0; 161 | + struct channel_t *channel = 162 | + container_of(inode->i_cdev, struct channel_t, cdev); 163 | + 164 | + mutex_lock(&(channel->fops_lock)); 165 | + if (channel->in_use) { 166 | + /* already in use.*/ 167 | + ret = -EBUSY; 168 | + } else { 169 | + channel->in_use = true; 170 | + } 171 | + mutex_unlock(&(channel->fops_lock)); 172 | + 173 | + /* propagate link and state change events that occur after the device 174 | + * is opened and not the stale ones. 175 | + */ 176 | + atomic_set(&(channel->db_event), 0); 177 | + atomic_set(&(channel->link_change_event), 0); 178 | + 179 | + filp->private_data = channel; 180 | + return ret; 181 | +} 182 | + 183 | + 184 | +/* close() syscall backing for nvscic2c channel devices.*/ 185 | +static int channel_fops_release(struct inode *inode, struct file *filp) 186 | +{ 187 | + int ret = 0; 188 | + struct channel_t *channel = filp->private_data; 189 | + 190 | + if (WARN_ON(!(channel != NULL))) 191 | + return -EFAULT; 192 | + 193 | + mutex_lock(&(channel->fops_lock)); 194 | + if (channel->in_use) 195 | + channel->in_use = false; 196 | + mutex_unlock(&(channel->fops_lock)); 197 | + 198 | + filp->private_data = NULL; 199 | + return ret; 200 | +} 201 | + 202 | + 203 | +/* 204 | + * mmap() syscall backing for nvscic2c channel devices. 205 | + * 206 | + * We support mapping Four distinct regions of memory that each nvscic2c 207 | + * channel owns to user-space: 208 | + * - Peer's memory for same channel(used for Tx), 209 | + * - Self's memory (used for Rx), 210 | + * - Self Private memory(not exposed to Peer). 211 | + * - NTB link control memory(common to all channels). 212 | + * We map just one segment of memory in each call based on the information 213 | + * (which memory segment) provided by user-space code. 
214 | + * 215 | + * We have added a strict check which makes user-space SW map area of memory 216 | + * which we exported, if user-space SW wanted to map little of it (but starting 217 | + * from base offset: 0 allow by relaxing the check in the function below. 218 | + */ 219 | +static int channel_fops_mmap(struct file *filp, struct vm_area_struct *vma) 220 | +{ 221 | + int ret = 0; 222 | + uint64_t mmap_type = vma->vm_pgoff; 223 | + uint64_t memaddr = 0x0; 224 | + uint64_t memsize = 0x0; 225 | + struct channel_t *channel = filp->private_data; 226 | + 227 | + if (WARN_ON(!(channel != NULL))) 228 | + return -EFAULT; 229 | + 230 | + if (WARN_ON(!(vma != NULL))) 231 | + return -EFAULT; 232 | + 233 | + mutex_lock(&(channel->fops_lock)); 234 | + 235 | + switch (mmap_type) { 236 | + case PEER_MEM_MMAP: 237 | + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); 238 | + memaddr = channel->tx_mem.aper; 239 | + memsize = channel->tx_mem.size; 240 | + break; 241 | + case SELF_MEM_MMAP: 242 | + memsize = channel->rx_mem.size; 243 | + break; 244 | + case CTRL_MEM_MMAP: 245 | + memaddr = channel->ctrl_mem.phys_addr; 246 | + memsize = channel->ctrl_mem.size; 247 | + break; 248 | + case LINK_MEM_MMAP: 249 | + if (vma->vm_flags & VM_WRITE) { 250 | + ret = -EPERM; 251 | + goto exit; 252 | + } 253 | + memsize = link_mgmt_get_status_mem_size(); 254 | + break; 255 | + default: 256 | + ERR("(%s): unrecognised mmap type: (%llu)\n", 257 | + channel->name, mmap_type); 258 | + goto exit; 259 | + } 260 | + 261 | + if ((vma->vm_end - vma->vm_start) != memsize) { 262 | + ERR("(%s): mmap type: (%llu), memsize mismatch\n", 263 | + channel->name, mmap_type); 264 | + goto exit; 265 | + } 266 | + 267 | + vma->vm_pgoff = 0; 268 | + vma->vm_flags |= (VM_DONTCOPY); // fork() not supported. 269 | + switch (mmap_type) { 270 | + case SELF_MEM_MMAP: 271 | + /* becuse channel doesn't have ntbdev to mmap in case 272 | + * memory was allocated via dma_alloc_coherent(). 
273 | + */ 274 | + ret = ntb_client_mmap_self_mem(vma, &(channel->rx_mem)); 275 | + break; 276 | + case LINK_MEM_MMAP: 277 | + /* becuse link mgmt is an abstraction and we haven't 278 | + * memcopied the link status mem info with channel. 279 | + */ 280 | + ret = link_mgmt_mmap_status_mem(vma); 281 | + break; 282 | + default: 283 | + ret = remap_pfn_range(vma, vma->vm_start, 284 | + PFN_DOWN(memaddr), 285 | + memsize, vma->vm_page_prot); 286 | + break; 287 | + } 288 | + if (ret) { 289 | + ERR("(%s): mmap() failed, mmap type:(%llu)\n", 290 | + channel->name, mmap_type); 291 | + } 292 | +exit: 293 | + mutex_unlock(&(channel->fops_lock)); 294 | + return ret; 295 | +} 296 | + 297 | + 298 | +/* 299 | + * poll() syscall backing for nvscic2c channel devices. 300 | + * 301 | + * user-space code shall call poll with FD on read, write and probably exception 302 | + * for channel state changes. 303 | + * 304 | + * If we are able to read(), write() or there is a pending state change event 305 | + * to be serviced, we return letting application call get_event(), otherwise 306 | + * kernel f/w will wait for waitq activity to occur. 307 | + */ 308 | +static unsigned int channel_fops_poll(struct file *filp, poll_table *wait) 309 | +{ 310 | + int ret = 0; 311 | + struct channel_t *channel = filp->private_data; 312 | + 313 | + if (WARN_ON(!(channel != NULL))) 314 | + return -EFAULT; 315 | + 316 | + mutex_lock(&(channel->fops_lock)); 317 | + 318 | + /* add all waitq if they are different for read, write & link+state.*/ 319 | + poll_wait(filp, &(channel->waitq), wait); 320 | + 321 | + /* wake up read, write (& exception - those who want to use) fd on 322 | + * getting link+state change event. 323 | + */ 324 | + if (atomic_read(&(channel->link_change_event))) { 325 | + /* Pending link event. 
*/ 326 | + atomic_dec(&channel->link_change_event); 327 | + ret = (POLLPRI | POLLIN | POLLOUT); 328 | + } else if (atomic_read(&(channel->db_event))) { 329 | + /* Pending doorbell events from remote. */ 330 | + atomic_dec(&channel->db_event); 331 | + ret = (POLLPRI | POLLIN | POLLOUT); 332 | + } 333 | + 334 | + mutex_unlock(&(channel->fops_lock)); 335 | + return ret; 336 | +} 337 | + 338 | + 339 | +/* 340 | + * ioctl() syscall backing for nvscic2c channel devices. 341 | + * 342 | + * We expose ioctls() for: 343 | + * - Passing all the memory segments information to user in one ioctl call. 344 | + * - user can notify Peer via an ioctl() call. 345 | + */ 346 | +static long channel_fops_ioctl(struct file *filp, unsigned int cmd, 347 | + unsigned long arg) 348 | +{ 349 | + int ret = 0; 350 | + uint8_t buf[256] = {0}; 351 | + struct channel_t *channel = filp->private_data; 352 | + 353 | + if (WARN_ON(!(channel != NULL))) 354 | + return -EFAULT; 355 | + 356 | + /* validate the cmd */ 357 | + if ((_IOC_TYPE(cmd) != NVSCIC2C_IOCTL_MAGIC) || 358 | + (_IOC_NR(cmd) == 0) || 359 | + (_IOC_NR(cmd) > NVSCIC2C_IOCTL_NUMBER_MAX) || 360 | + (_IOC_SIZE(cmd) > 256)) { 361 | + ERR("(%s): Incorrect ioctl cmd/cmd params/magic\n", 362 | + channel->name); 363 | + return -ENOTTY; 364 | + } 365 | + 366 | + /* copy the cmd if it was meant from user->kernel: notify_remote.*/ 367 | + (void) memset(buf, 0, sizeof(buf)); 368 | + if (_IOC_DIR(cmd) & _IOC_WRITE) { 369 | + if (copy_from_user(buf, (void __user *)arg, _IOC_SIZE(cmd))) 370 | + return -EFAULT; 371 | + } 372 | + 373 | + switch (cmd) { 374 | + case NVSCIC2C_IOCTL_GET_INFO: 375 | + ret = ioctl_get_info_impl(channel, 376 | + (struct nvscic2c_info *) buf); 377 | + break; 378 | + case NVSCIC2C_IOCTL_NOTIFY_REMOTE: 379 | + ret = ioctl_notify_remote_impl(channel, (uint8_t *)buf); 380 | + break; 381 | + default: 382 | + ERR("(%s): unrecognised nvscic2c ioctl cmd: 0x%x\n", 383 | + channel->name, cmd); 384 | + ret = -ENOTTY; 385 | + break; 386 |
+ } 387 | + 388 | + /* copy the cmd result back to user if it was kernel->user: get_info.*/ 389 | + if ((ret == 0) && (_IOC_DIR(cmd) & _IOC_READ)) 390 | + ret = copy_to_user((void __user *)arg, buf, _IOC_SIZE(cmd)) ? -EFAULT : 0; 391 | + 392 | + return ret; 393 | +} 394 | + 395 | + 396 | +/* 397 | + * set of channel file operations for each nvscic2c channel. 398 | + * We do not support: read() and write() on nvscic2c channel 399 | + * descriptors. 400 | + */ 401 | +static const struct file_operations channel_fops = { 402 | + .owner = THIS_MODULE, 403 | + .open = channel_fops_open, 404 | + .release = channel_fops_release, 405 | + .mmap = channel_fops_mmap, 406 | + .unlocked_ioctl = channel_fops_ioctl, 407 | + .poll = channel_fops_poll, 408 | + .llseek = noop_llseek, 409 | +}; 410 | + 411 | + 412 | +/* 413 | + * helper function to implement NVSCIC2C_IOCTL_GET_INFO ioctl call. 414 | + * 415 | + * All important channel dev node properties required for user-space 416 | + * to map the channel memory and work without going to LKM for data 417 | + * xfer are exported in this ioctl implementation. 418 | + * 419 | + * Because we export 4 different memory regions for a single nvscic2c channel 420 | + * device, export the memory regions as masked offsets. 421 | + */ 422 | +static int ioctl_get_info_impl(struct channel_t *channel, 423 | + struct nvscic2c_info *get_info) 424 | +{ 425 | + /* actual offsets of 3 mem are not shared as we have to support 426 | + * multiple mmap for a single nvscic2c char device.
427 | + */ 428 | + get_info->nframes = channel->nframes; 429 | + get_info->frame_sz = channel->frame_sz; 430 | + get_info->xfer_type = channel->bulk_xfer_mode; 431 | + get_info->peer.offset = (PEER_MEM_MMAP << PAGE_SHIFT); 432 | + get_info->peer.size = channel->tx_mem.size; 433 | + get_info->self.offset = (SELF_MEM_MMAP << PAGE_SHIFT); 434 | + get_info->self.size = channel->rx_mem.size; 435 | + get_info->ctrl.offset = (CTRL_MEM_MMAP << PAGE_SHIFT); 436 | + get_info->ctrl.size = channel->ctrl_mem.size; 437 | + get_info->link.offset = (LINK_MEM_MMAP << PAGE_SHIFT); 438 | + get_info->link.size = link_mgmt_get_status_mem_size(); 439 | + // FIXME: remove this platform, added for unit testing. 440 | +#if defined(CONFIG_ARCH_TEGRA) 441 | + strcpy(get_info->platform, "tegra-umd"); 442 | +#elif defined(CONFIG_X86_64) 443 | + strcpy(get_info->platform, "x86-umd"); 444 | +#endif 445 | + return 0; 446 | +} 447 | + 448 | + 449 | +/* 450 | + * helper function to implement NVSCIC2C_IOCTL_NOTIFY_REMOTE ioctl call. 451 | + * 452 | + * For channels such as bulk xfer channels, where not all DBs are in use, 453 | + * raise an error if user-space SW asks to trigger a DB which the channel 454 | + * device has not configured. 455 | + * 456 | + * Otherwise, trigger the peer DB ids inferred from the mask user-space SW 457 | + * passed in argument and return the result.
458 | + */ 459 | +static int ioctl_notify_remote_impl(struct channel_t *channel, 460 | + uint8_t *db_bits) 461 | +{ 462 | + uint64_t set_db_bits = 0x0; 463 | + int ret = 0; 464 | + 465 | + if ((*db_bits) & (NVSCIC2C_NOTIFY_PRODUCER)) { 466 | + if (channel->prod_event_id == DB_ID_NIL) { 467 | + ERR("(%s): Prod DB idx unavailable\n", 468 | + channel->name); 469 | + return -EINVAL; 470 | + } 471 | + set_db_bits |= (1 << channel->prod_event_id); 472 | + } else if ((*db_bits) & (NVSCIC2C_NOTIFY_CONSUMER)) { 473 | + if (channel->cons_event_id == DB_ID_NIL) { 474 | + ERR("(%s): Cons DB idx unavailable\n", 475 | + channel->name); 476 | + return -EINVAL; 477 | + } 478 | + set_db_bits |= (1 << channel->cons_event_id); 479 | + } else if ((*db_bits) & (NVSCIC2C_NOTIFY_STATE)) { 480 | + if (channel->state_event_id == DB_ID_NIL) { 481 | + ERR("(%s): State DB idx unavailable\n", 482 | + channel->name); 483 | + return -EINVAL; 484 | + } 485 | + set_db_bits |= (1 << channel->state_event_id); 486 | + } else { 487 | + ERR("(%s): unrecognised notify remote ioctl cmd arg: 0x%x\n", 488 | + channel->name, *db_bits); 489 | + return -EINVAL; 490 | + } 491 | + 492 | + /* trigger the DB.*/ 493 | + ret = ntb_client_peer_db_set(set_db_bits); 494 | + if (ret) { 495 | + ERR("(%s): Failed to trigger peer db(s):(0x%08llx)\n", 496 | + channel->name, set_db_bits); 497 | + return ret; 498 | + } 499 | + 500 | + return 0; 501 | +} 502 | + 503 | + 504 | +/* Clean up the c2c channel devices. 
*/ 505 | +static int remove_channel_device(struct channel_drv_ctx_t *ch_drv_ctx, 506 | + struct channel_t *channel) 507 | +{ 508 | + int ret = 0; 509 | + 510 | + if ((!ch_drv_ctx) 511 | + || (!channel)) { 512 | + return ret; 513 | + } 514 | + 515 | +#ifdef CONFIG_DEBUG_FS 516 | + channel_dbgfs_remove(channel); 517 | +#endif 518 | + 519 | + /* delink the channel DB associations.*/ 520 | + update_db_ch_tbl(ch_drv_ctx, channel, false); 521 | + 522 | + /* remove the channel device.*/ 523 | + if (channel->device) { 524 | + cdev_del(&channel->cdev); 525 | + device_del(channel->device); 526 | + channel->device = NULL; 527 | + } 528 | + 529 | + /* free the internal memory used for counter management.*/ 530 | + channel_free(channel); 531 | + 532 | + return ret; 533 | +} 534 | + 535 | + 536 | +/* Create the c2c channel devices for the user-space to: 537 | + * - Map the channel Self and Peer area. 538 | + * - send NTB DB notifications to remote/peer. 539 | + */ 540 | +static int add_channel_device(struct channel_drv_ctx_t *ch_drv_ctx, 541 | + struct channel_t *channel) 542 | +{ 543 | + int ret = 0; 544 | + 545 | + /* basic validation.*/ 546 | + if ((!ch_drv_ctx) 547 | + || (!channel)) { 548 | + ret = -EINVAL; 549 | + ERR("(%s): Invalid Params\n", __func__); 550 | + goto err; 551 | + } 552 | + 553 | + /* validate channel frames. As we map channel Rx and Tx to user-space, 554 | + * this must happen on PAGE boundaries. 555 | + */ 556 | + ret = validate_channel_params(channel); 557 | + if (ret) { 558 | + ERR("Failed to validate channel parameters\n"); 559 | + goto err; 560 | + } 561 | + 562 | + /* parition the whole of Self and Peer memory into the channel 563 | + * needs based on the frames/slots. 
564 | + */ 565 | + ret = channel_alloc(channel, 566 | + &(ch_drv_ctx->self_mem_base), 567 | + &(ch_drv_ctx->peer_mem_base), 568 | + &(ch_drv_ctx->running_off)); 569 | + if (ret) { 570 | + ERR("(%s):Failed to allocate/initialise channel internals\n", 571 | + channel->name); 572 | + goto err; 573 | + } 574 | + 575 | + /* create the nvscic2c channel device - interface for user-space sw.*/ 576 | + channel->dev = MKDEV(MAJOR(ch_drv_ctx->char_dev), channel->minor); 577 | + cdev_init(&(channel->cdev), &(channel_fops)); 578 | + channel->cdev.owner = THIS_MODULE; 579 | + ret = cdev_add(&(channel->cdev), channel->dev, 1); 580 | + if (ret != 0) { 581 | + ERR("(%s): cdev_add() failed\n", channel->name); 582 | + goto err; 583 | + } 584 | + 585 | + /* parent is this hvd dev */ 586 | + channel->device = device_create(ch_drv_ctx->class, NULL, 587 | + channel->dev, channel, 588 | + channel->name); 589 | + if (IS_ERR(channel->device)) { 590 | + ret = PTR_ERR(channel->device); 591 | + ERR("(%s): device_create() failed\n", channel->name); 592 | + } 593 | + dev_set_drvdata(channel->device, channel); 594 | + 595 | + /* associate the NTB DB ids with the channel_device 596 | + * This is for delivering DB notifications to this channel. 597 | + * Enable those DBs too. 598 | + */ 599 | + ret = update_db_ch_tbl(ch_drv_ctx, channel, true); 600 | + if (ret) { 601 | + ERR("(%s): Failed to associate DB with channel\n", 602 | + channel->name); 603 | + goto err; 604 | + } 605 | + 606 | +#ifdef CONFIG_DEBUG_FS 607 | + channel_dbgfs_create(channel); 608 | +#endif 609 | + 610 | + /* all okay.*/ 611 | + return ret; 612 | + 613 | +err: 614 | + remove_channel_device(ch_drv_ctx, channel); 615 | + return ret; 616 | +} 617 | + 618 | + 619 | +/* 620 | + * Entry point for the nvscic2c channel char device sub-module/abstraction. 621 | + * 622 | + * On successful return (0), devices would have been created and ready to 623 | + * accept ioctls from user-space application. 
624 | + * 625 | + * Mapping of each NTB doorbell to a C2C channel is also maintained here. 626 | + * 627 | + * We must come here after setting up the NTB client with PCIe shared memory 628 | + * (Self) and PCIe aperture(Peer) available. 629 | + */ 630 | +int channel_setup_devices(struct c2c_drv_ctx_t *drv_ctx) 631 | +{ 632 | + int ret = 0, i = 0; 633 | + struct channel_t *channel = NULL; 634 | + struct channel_param_t *param = NULL; 635 | + 636 | + /* basic validation.*/ 637 | + if ((!drv_ctx) 638 | + || (!drv_ctx->c2c_param.channel_nr)) { 639 | + ret = -EINVAL; 640 | + ERR("(%s): Invalid Params\n", __func__); 641 | + goto err; 642 | + } 643 | + 644 | + /* start by allocating the c2c channel driver ctx.*/ 645 | + ch_drv_ctx = kzalloc(sizeof(*ch_drv_ctx), GFP_KERNEL); 646 | + if (!ch_drv_ctx) { 647 | + ret = -ENOMEM; 648 | + ERR("Failed to allocate channel driver ctx\n"); 649 | + goto err; 650 | + } 651 | + ch_drv_ctx->channel_nr = drv_ctx->c2c_param.channel_nr; 652 | + memcpy(&(ch_drv_ctx->self_mem_base), &(drv_ctx->self_mem), 653 | + sizeof(ch_drv_ctx->self_mem_base)); 654 | + memcpy(&(ch_drv_ctx->peer_mem_base), &(drv_ctx->peer_mem), 655 | + sizeof(ch_drv_ctx->peer_mem_base)); 656 | + 657 | + /* allocate the whole chardev range */ 658 | + ret = alloc_chrdev_region(&(ch_drv_ctx->char_dev), 0, 659 | + ch_drv_ctx->channel_nr, MODULE_NAME); 660 | + if (ret < 0) { 661 | + ERR("(%s): alloc_chrdev_region() failed\n", __func__); 662 | + goto err; 663 | + } 664 | + 665 | + ch_drv_ctx->class = class_create(THIS_MODULE, MODULE_NAME); 666 | + if (IS_ERR_OR_NULL(ch_drv_ctx->class)) { 667 | + ret = PTR_ERR(ch_drv_ctx->class); 668 | + ERR("Failed to create channel char class: %ld\n", 669 | + PTR_ERR(ch_drv_ctx->class)); 670 | + goto err; 671 | + } 672 | + 673 | + /* allocate char devices context for supported channels.*/ 674 | + ch_drv_ctx->channels = kzalloc((ch_drv_ctx->channel_nr * 675 | + sizeof(*ch_drv_ctx->channels)), 676 | + GFP_KERNEL); 677 | + if 
(!ch_drv_ctx->channels) { 678 | + ret = -ENOMEM; 679 | + ERR("Failed to allocate channel char devices array\n"); 680 | + goto err; 681 | + } 682 | + 683 | + /* create the NTB DB<->Channel association table. 684 | + * start by querying the supported DB by NTB client. 685 | + */ 686 | + mutex_init(&ch_drv_ctx->db_ch_tbl_lock); 687 | + ch_drv_ctx->db_vec_nr = ntb_client_db_vector_count(); 688 | + if (ch_drv_ctx->db_vec_nr <= 0) { 689 | + ret = -EINVAL; 690 | + ERR("NTB DB vector count:(%u) invalid\n", 691 | + ch_drv_ctx->db_vec_nr); 692 | + goto err; 693 | + } 694 | + DBG("NTB module has DB vecs:(%d)\n", ch_drv_ctx->db_vec_nr); 695 | + 696 | + ch_drv_ctx->db_ch_tbl = kzalloc((ch_drv_ctx->db_vec_nr * 697 | + sizeof(struct channel_t *)), 698 | + GFP_KERNEL); 699 | + if (!ch_drv_ctx->db_ch_tbl) { 700 | + ret = -ENOMEM; 701 | + ERR("Failed to allocate NTB channel table\n"); 702 | + goto err; 703 | + } 704 | + 705 | + /* create char devices for each channel.*/ 706 | + for (i = 0; i < ch_drv_ctx->channel_nr; i++) { 707 | + channel = &(ch_drv_ctx->channels[i]); 708 | + param = &(drv_ctx->c2c_param.ch_params[i]); 709 | + 710 | + /* copy the parameters from nvscic2c driver ctx.*/ 711 | + channel->minor = param->ch_id; 712 | + channel->event_type = param->event_type; 713 | + channel->prod_event_id = param->prod_event_id; 714 | + channel->cons_event_id = param->cons_event_id; 715 | + channel->state_event_id = param->state_event_id; 716 | + channel->nframes = param->nframes; 717 | + channel->frame_sz = param->frame_sz; 718 | + channel->align = param->align; 719 | + channel->bulk_xfer_mode = param->bulk_xfer_mode; 720 | + strcpy(channel->name, param->ch_name); 721 | + 722 | + /* create nvscic2c channel device.*/ 723 | + ret = add_channel_device(ch_drv_ctx, channel); 724 | + if (ret) { 725 | + ERR("Failed setting up nvscic2c device: (%s)\n", 726 | + channel->name); 727 | + goto err; 728 | + } 729 | + } 730 | + 731 | + /* all okay.*/ 732 | + return ret; 733 | + 734 | +err: 735 | + 
channel_release_devices(drv_ctx); 736 | + return ret; 737 | +} 738 | + 739 | + 740 | +/* exit point for nvscic2c channel char device sub-module/abstraction.*/ 741 | +int channel_release_devices(struct c2c_drv_ctx_t *drv_ctx) 742 | +{ 743 | + int ret = 0, i = 0; 744 | + 745 | + if (!ch_drv_ctx) 746 | + return ret; 747 | + 748 | + /* remove all the channel char devices.*/ 749 | + if (ch_drv_ctx->channels) { 750 | + for (i = 0; i < ch_drv_ctx->channel_nr; i++) { 751 | + struct channel_t *channel = &(ch_drv_ctx->channels[i]); 752 | + 753 | + remove_channel_device(ch_drv_ctx, channel); 754 | + } 755 | + kfree(ch_drv_ctx->channels); 756 | + ch_drv_ctx->channels = NULL; 757 | + } 758 | + 759 | + kfree(ch_drv_ctx->db_ch_tbl); 760 | + ch_drv_ctx->db_ch_tbl = NULL; 761 | + 762 | + mutex_destroy(&ch_drv_ctx->db_ch_tbl_lock); 763 | + 764 | + if (ch_drv_ctx->class) { 765 | + class_destroy(ch_drv_ctx->class); 766 | + ch_drv_ctx->class = NULL; 767 | + } 768 | + 769 | + if (ch_drv_ctx->char_dev) { 770 | + unregister_chrdev_region(ch_drv_ctx->char_dev, 771 | + ch_drv_ctx->channel_nr); 772 | + ch_drv_ctx->char_dev = 0; 773 | + } 774 | + 775 | + kfree(ch_drv_ctx); 776 | + ch_drv_ctx = NULL; 777 | + 778 | + return ret; 779 | +} 780 | + 781 | + 782 | +/* 783 | + * Function called by nvscic2c module.c on seeing a change in the 784 | + * NTB link status. Here we pass on this event to each channel 785 | + * which is required for their poll() implementation. 786 | + * 787 | + * This is supposed to be called only change in link status not 788 | + * for every NTB link event(hb). 789 | + */ 790 | +int channel_link_event(enum link_status status) 791 | +{ 792 | + int i = 0; 793 | + struct channel_t *channel = NULL; 794 | + 795 | + if (!ch_drv_ctx) { 796 | + ERR("(%s): channel abstraction not ready yet\n", __func__); 797 | + return -EINVAL; 798 | + } 799 | + 800 | + /* pass this event to all the channels. 
*/ 801 | + for (i = 0; i < ch_drv_ctx->channel_nr; i++) { 802 | + channel = &(ch_drv_ctx->channels[i]); 803 | + 804 | + /* make poll() look at ntb link status again if waiting.*/ 805 | + atomic_inc(&(channel->link_change_event)); 806 | + wake_up_interruptible_all(&(channel->waitq)); 807 | + } 808 | + 809 | + return 0; 810 | +} 811 | + 812 | + 813 | +/* 814 | + * Function called by NTB client(ntb-client) on getting a NTB DB 815 | + * event. 816 | + * 817 | + * We receive the DB vector/index which triggered this event. 818 | + * We internally go through the channel and db association and 819 | + * pass on the db event to relevant channel. 820 | + * 821 | + * A channel will/may have multiple DB's but we should be get only 822 | + * 1 DB per event callback. 823 | + */ 824 | +int channel_db_event(int db_idx) 825 | +{ 826 | + struct channel_t *channel = NULL; 827 | + 828 | + /* validate.*/ 829 | + if (!ch_drv_ctx) { 830 | + ERR("(%s): channel abstraction not ready\n", __func__); 831 | + return -EINVAL; 832 | + } 833 | + 834 | + /* validate. 835 | + * we shall not read the db_mask on each call and verify if db was 836 | + * enabled. If it wasn't db_ch_tbl will have NULL for it. 
837 | + */ 838 | + if (db_idx >= ch_drv_ctx->db_vec_nr) { 839 | + ERR("(%s): Unexpected DB vec received.\n", __func__); 840 | + return -EINVAL; 841 | + } 842 | + 843 | + channel = ch_drv_ctx->db_ch_tbl[db_idx]; 844 | + if (!channel) { 845 | + ERR("(%s): No channel is associated to db idx:(%d)\n", 846 | + __func__, db_idx); 847 | + return -EINVAL; 848 | + } 849 | + 850 | + /* make poll() look at data counters or state change if waiting.*/ 851 | + if ((channel->state_event_id == db_idx) 852 | + || (channel->prod_event_id == db_idx) 853 | + || (channel->cons_event_id == db_idx)) { 854 | + /* this is for channel data or state change event.*/ 855 | + atomic_inc(&(channel->db_event)); 856 | + wake_up_interruptible_all(&(channel->waitq)); 857 | + } 858 | + 859 | + return 0; 860 | +} 861 | + 862 | + 863 | +/* 864 | + * helper function to update the NTB doorbell and channel association 865 | + * for a single DB index. 866 | + * 867 | + * Will do nothing for DB which is beyond the supported DBs like (DB_ID_NIL). 868 | + * 869 | + * IMPORTANT: must be called with lock held to serialise access to the 870 | + * table. 871 | + */ 872 | +static int update_db_ch_tbl_entry(struct channel_drv_ctx_t *ch_drv_ctx, 873 | + uint8_t db_idx, struct channel_t *channel, 874 | + bool add) 875 | +{ 876 | + int ret = 0; 877 | + struct channel_t *reg_ch = NULL; 878 | + 879 | + /* checks for DB ids requested. 880 | + * This covers the case for bulk channels, where db id can be DB_ID_NIL.
881 | + */ 882 | + if (db_idx < ch_drv_ctx->db_vec_nr) { 883 | + if (add) { 884 | + reg_ch = ch_drv_ctx->db_ch_tbl[db_idx]; 885 | + if (reg_ch) { 886 | + ret = -EINVAL; 887 | + ERR("(%s): DB:(%d) pre-registered with: (%s)\n", 888 | + channel->name, db_idx, 889 | + reg_ch->name); 890 | + } else { 891 | + ret = ntb_client_db_clear((1 << db_idx)); 892 | + ret |= ntb_client_db_clear_mask((1 << db_idx)); 893 | + if (ret) { 894 | + ERR("(%s): Err enabling NTB DB: (%d)\n", 895 | + channel->name, db_idx); 896 | + } else { 897 | + ch_drv_ctx->db_ch_tbl[db_idx] = channel; 898 | + } 899 | + } 900 | + } else { 901 | + ntb_client_db_set_mask((1 << db_idx)); 902 | + ntb_client_db_clear((1 << db_idx)); 903 | + ch_drv_ctx->db_ch_tbl[db_idx] = NULL; 904 | + } 905 | + } else if ((db_idx != DB_ID_NIL) 906 | + && (add)) { 907 | + /* all DB's other than DB_ID_NIL(invalid for bulk channels). 908 | + * return error only for addition. 909 | + */ 910 | + ret = -EINVAL; 911 | + ERR("(%s): DB idx:(%d) un-supported. Supported DB idx:[0-%d]\n", 912 | + channel->name, db_idx, 913 | + (ch_drv_ctx->db_vec_nr - 1)); 914 | + } 915 | + 916 | + return ret; 917 | +} 918 | + 919 | + 920 | +/* 921 | + * To maintain an association of NTB doorbell and channels, 922 | + * This is required when we get any notification callback via NTB DB vector 923 | + * to identify which channel it belongs to. Therefore, DB must be associated 924 | + * to one channel, although a channel may have multiple DBs. 925 | + * 926 | + * We must come here after having queried the NTB driver for supported DBs. 
927 | + */ 928 | +static int update_db_ch_tbl(struct channel_drv_ctx_t *ch_drv_ctx, 929 | + struct channel_t *channel, bool add) 930 | +{ 931 | + int ret = 0; 932 | + 933 | + /* basic validation.*/ 934 | + if ((!ch_drv_ctx) 935 | + || (!channel)) { 936 | + ret = -EINVAL; 937 | + ERR("(%s): Invalid Params\n", __func__); 938 | + goto err; 939 | + } 940 | + 941 | + mutex_lock(&(ch_drv_ctx->db_ch_tbl_lock)); 942 | + ret = update_db_ch_tbl_entry(ch_drv_ctx, 943 | + channel->prod_event_id, channel, add); 944 | + ret |= update_db_ch_tbl_entry(ch_drv_ctx, 945 | + channel->cons_event_id, channel, add); 946 | + ret |= update_db_ch_tbl_entry(ch_drv_ctx, 947 | + channel->state_event_id, channel, add); 948 | + mutex_unlock(&(ch_drv_ctx->db_ch_tbl_lock)); 949 | + 950 | +err: 951 | + return ret; 952 | +} 953 | diff --git a/drivers/misc/nvscic2c/channel-dbgfs.c b/drivers/misc/nvscic2c/channel-dbgfs.c 954 | new file mode 100644 955 | index 0000000..9cd5779 956 | --- /dev/null 957 | +++ b/drivers/misc/nvscic2c/channel-dbgfs.c 958 | @@ -0,0 +1,518 @@ 959 | +/* 960 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 961 | + * 962 | + * This program is free software; you can redistribute it and/or modify it 963 | + * under the terms and conditions of the GNU General Public License, 964 | + * version 2, as published by the Free Software Foundation. 965 | + * 966 | + * This program is distributed in the hope it will be useful, but WITHOUT 967 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 968 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 969 | + * more details. 
970 | + */ 971 | + 972 | +#include "channel.h" 973 | +#include "chip-to-chip.h" 974 | +#include 975 | +#include 976 | +#include 977 | +#include 978 | +#include 979 | +#include 980 | +#include 981 | +#include 982 | + 983 | + 984 | +/* 985 | + * This is a sample dbgfs interface to test read/write from user-space to 986 | + * LKM for channel memories and vice-versa. 987 | + * 988 | + * This abstraction is not capable of handling LINK event changes and 989 | + * requires LINK_UP before memory region/read/write can start. 990 | + * 991 | + * The parameters: offset, pitch must be same across write and read runs 992 | + * across SoCs. 993 | + */ 994 | + 995 | + 996 | +/* aligned to 4.*/ 997 | +#define DATA_LEN (1024) 998 | +#define MAX_DATA_LEN (PAGE_SIZE) 999 | +#define DEFAULT_ITERATION (512) 1000 | + 1001 | +/* 1002 | + * packet header. Payload follows immediately after this header. 1003 | + */ 1004 | +struct header_t { 1005 | + /* sequence number. */ 1006 | + uint64_t seq; 1007 | + 1008 | + /* length of data: DATA_LEN 1009 | + * written only during write. 1010 | + * Next packet starts at header+size. 1011 | + */ 1012 | + uint64_t size; 1013 | +}; 1014 | + 1015 | + 1016 | +/* 1017 | + * Helper function to make code concise and do the 1018 | + * common validations applicable to all read/write 1019 | + * file operations. Because we would write/read on Peer memory 1020 | + * over PCIe, check for NTB link status. 1021 | + */ 1022 | +static int common_validations(struct channel_t *channel) 1023 | +{ 1024 | + /* if context/private driver data was valid. */ 1025 | + if (!channel) { 1026 | + ERR("dbgfs: Channel drv.
data NULL\n"); 1027 | + return -EINVAL; 1028 | + } 1029 | + 1030 | + if (!channel->tx_mem.pva) { 1031 | + ERR("dbgfs: Remote aperture NULL\n"); 1032 | + return -EINVAL; 1033 | + } 1034 | + 1035 | + if (!channel->rx_mem.pva) { 1036 | + ERR("dbgfs: Local memory NULL\n"); 1037 | + return -EINVAL; 1038 | + } 1039 | + 1040 | + if (!channel->ctrl_mem.pva) { 1041 | + ERR("dbgfs: Ctrl memory NULL\n"); 1042 | + return -EINVAL; 1043 | + } 1044 | + 1045 | + return 0; 1046 | +} 1047 | + 1048 | + 1049 | +static int write_mem(struct channel_t *channel, void *pva, size_t size, 1050 | + bool iomem) 1051 | +{ 1052 | + off_t off = 0; 1053 | + void *packet = NULL; 1054 | + char *data = NULL; 1055 | + uint32_t write = 0; 1056 | + size_t packet_size = 0; 1057 | + struct header_t *header = NULL; 1058 | + struct header_t null_header = {0}; 1059 | +#if defined(CONFIG_ARCH_TEGRA) 1060 | + char *platform = "tegra-lkm"; 1061 | +#elif defined(CONFIG_X86_64) 1062 | + char *platform = "x86-lkm"; 1063 | +#endif 1064 | + 1065 | + /* packet = header + data payload.*/ 1066 | + packet_size = sizeof(*header) + DATA_LEN; 1067 | + packet = kzalloc(packet_size, GFP_KERNEL); 1068 | + if (!packet) 1069 | + return write; 1070 | + 1071 | + header = (struct header_t *)(packet); 1072 | + data = (char *)(packet) + sizeof(*header); 1073 | + 1074 | + /* we need space for one header(null/eos) also.*/ 1075 | + while (((off + packet_size) < (size - sizeof(*header))) 1076 | + && (write < channel->dbgfs_iteration)) { 1077 | + /* packet header. */ 1078 | + header->seq = write; 1079 | + header->size = DATA_LEN; 1080 | + 1081 | + /* packet data.
*/ 1082 | + snprintf(data, (DATA_LEN-1), "(%s): (%s): (%05u): (%lld)", 1083 | + platform, channel->name, write, 1084 | + ktime_to_ns(ktime_get())); 1085 | + 1086 | + /* write.*/ 1087 | + if (iomem) 1088 | + memcpy_toio((pva + off), packet, packet_size); 1089 | + else 1090 | + memcpy((pva + off), packet, packet_size); 1091 | + 1092 | + write++; 1093 | + off += packet_size; 1094 | + } 1095 | + 1096 | + /* write null packet.*/ 1097 | + null_header.seq = write; 1098 | + null_header.size = 0; 1099 | + if (iomem) { 1100 | + memcpy_toio((pva + off), &(null_header), 1101 | + sizeof(null_header)); 1102 | + } else { 1103 | + memcpy((pva + off), &(null_header), 1104 | + sizeof(null_header)); 1105 | + } 1106 | + 1107 | + kfree(packet); 1108 | + return write; 1109 | +} 1110 | + 1111 | + 1112 | +static int read_mem(struct channel_t *channel, void *pva, size_t size, 1113 | + bool iomem) 1114 | +{ 1115 | + off_t off = 0; 1116 | + void *data = NULL; 1117 | + uint32_t read = 0; 1118 | + struct header_t header = {0}; 1119 | + 1120 | + /* maximum allocation.*/ 1121 | + data = kzalloc(MAX_DATA_LEN, GFP_KERNEL); 1122 | + if (!data) 1123 | + return read; 1124 | + 1125 | + while (((off + sizeof(header)) < size) 1126 | + && (read < channel->dbgfs_iteration)) { 1127 | + if (iomem) { 1128 | + memcpy_fromio(&header, (pva + off), sizeof(header)); 1129 | + off += sizeof(header); 1130 | + if ((header.size) 1131 | + && (header.size <= MAX_DATA_LEN) 1132 | + && ((off + header.size) < size)) { 1133 | + memcpy_fromio(data, (pva + off), header.size); 1134 | + off += header.size; 1135 | + } else 1136 | + break; 1137 | + } else { 1138 | + memcpy(&header, (pva + off), sizeof(header)); 1139 | + off += sizeof(header); 1140 | + if ((header.size) 1141 | + && (header.size <= MAX_DATA_LEN) 1142 | + && ((off + header.size) < size)) { 1143 | + memcpy(data, (pva + off), header.size); 1144 | + off += header.size; 1145 | + } else 1146 | + break; 1147 | + } 1148 | + DBG("dbgfs: (%s): (%05llu): (%s)\n", 1149 | + 
channel->name, header.seq, (char *)data); 1150 | + read++; 1151 | + } 1152 | + 1153 | + kfree(data); 1154 | + return read; 1155 | +} 1156 | + 1157 | + 1158 | +/* 1159 | + * Debugfs interface for user to issue write command that will 1160 | + * go and touch self memory region for iterations times and write sample bytes. 1161 | + */ 1162 | +static int self_mem_write(void *data, u64 val) 1163 | +{ 1164 | + struct channel_t *channel = NULL; 1165 | + int packets = 0; 1166 | + 1167 | + channel = (struct channel_t *)(data); 1168 | + 1169 | + if (common_validations(channel)) 1170 | + return -EINVAL; 1171 | + 1172 | + DBG("dbgfs: (%s): writing self memory region\n", channel->name); 1173 | + packets = write_mem(channel, channel->rx_mem.pva, channel->rx_mem.size, 1174 | + false); 1175 | + DBG("dbgfs: (%s): Write:(%d) packets on self memory region.\n", 1176 | + channel->name, packets); 1177 | + 1178 | + return 0; 1179 | +} 1180 | + 1181 | + 1182 | +/* 1183 | + * Debugfs interface for user to issue read command that will 1184 | + * go and read self memory region for iterations times and read the 1185 | + * sample bytes. 1186 | + */ 1187 | +static int self_mem_read(void *data, u64 *val) 1188 | +{ 1189 | + struct channel_t *channel = NULL; 1190 | + int packets = 0; 1191 | + 1192 | + channel = (struct channel_t *)(data); 1193 | + 1194 | + if (common_validations(channel)) 1195 | + return -EINVAL; 1196 | + 1197 | + DBG("dbgfs: (%s): reading self memory region\n", channel->name); 1198 | + packets = read_mem(channel, channel->rx_mem.pva, channel->rx_mem.size, 1199 | + false); 1200 | + DBG("dbgfs: (%s): Read:(%d) packets from self memory region.\n", 1201 | + channel->name, packets); 1202 | + 1203 | + *val = packets; 1204 | + return 0; 1205 | +} 1206 | + 1207 | + 1208 | +/* 1209 | + * Debugfs interface for user to issue write command that will 1210 | + * go and touch Peer memory region for iterations times and write sample bytes. 
1211 | + */ 1212 | +static int peer_mem_write(void *data, u64 val) 1213 | +{ 1214 | + struct channel_t *channel = NULL; 1215 | + int packets = 0; 1216 | + 1217 | + channel = (struct channel_t *)(data); 1218 | + 1219 | + if (common_validations(channel)) 1220 | + return -EINVAL; 1221 | + 1222 | + DBG("dbgfs: (%s): writing peer memory region\n", channel->name); 1223 | + if (link_mgmt_get_link_status() != LINK_UP) { 1224 | + INFO("dbgfs: (%s): Skipping write - NTB link not up.\n", 1225 | + channel->name); 1226 | + } else { 1227 | + DBG("dbgfs: (%s): NTB link is UP\n", channel->name); 1228 | + packets = write_mem(channel, channel->tx_mem.pva, 1229 | + channel->tx_mem.size, true); 1230 | + DBG("dbgfs: (%s): Write:(%d) packets on peer memory region.\n", 1231 | + channel->name, packets); 1232 | + } 1233 | + 1234 | + return 0; 1235 | +} 1236 | + 1237 | + 1238 | +/* 1239 | + * Debugfs interface for user to issue read command that will 1240 | + * go and read Peer memory region for iterations times and read the 1241 | + * sample bytes. 
1242 | + */ 1243 | +static int peer_mem_read(void *data, u64 *val) 1244 | +{ 1245 | + struct channel_t *channel = NULL; 1246 | + int packets = 0; 1247 | + 1248 | + channel = (struct channel_t *)(data); 1249 | + 1250 | + if (common_validations(channel)) 1251 | + return -EINVAL; 1252 | + 1253 | + DBG("dbgfs: (%s): reading peer memory region\n", channel->name); 1254 | + if (link_mgmt_get_link_status() != LINK_UP) { 1255 | + INFO("dbgfs: (%s): Skipping read - NTB link not up.\n", 1256 | + channel->name); 1257 | + } else { 1258 | + DBG("dbgfs: (%s): NTB link is UP\n", channel->name); 1259 | + packets = read_mem(channel, channel->tx_mem.pva, 1260 | + channel->tx_mem.size, true); 1261 | + DBG("dbgfs: (%s): Read:(%d) packets from peer memory region.\n", 1262 | + channel->name, packets); 1263 | + } 1264 | + 1265 | + *val = packets; 1266 | + return 0; 1267 | +} 1268 | + 1269 | + 1270 | +/* 1271 | + * Debugfs interface for user to issue write command that will 1272 | + * go and touch ctrl memory region for iterations times and write sample bytes. 1273 | + */ 1274 | +static int ctrl_mem_write(void *data, u64 val) 1275 | +{ 1276 | + int packets = 0; 1277 | + struct channel_t *channel = NULL; 1278 | + 1279 | + channel = (struct channel_t *)(data); 1280 | + 1281 | + if (common_validations(channel)) 1282 | + return -EINVAL; 1283 | + 1284 | + DBG("dbgfs: (%s): writing ctrl memory region\n", channel->name); 1285 | + packets = write_mem(channel, channel->ctrl_mem.pva, 1286 | + channel->ctrl_mem.size, false); 1287 | + DBG("dbgfs: (%s): Write:(%d) packets on ctrl memory region.\n", 1288 | + channel->name, packets); 1289 | + 1290 | + return 0; 1291 | +} 1292 | + 1293 | + 1294 | +/* 1295 | + * Debugfs interface for user to issue read command that will 1296 | + * go and read ctrl memory region for iterations times and read the 1297 | + * sample bytes. 
1298 | + */ 1299 | +static int ctrl_mem_read(void *data, u64 *val) 1300 | +{ 1301 | + int packets = 0; 1302 | + struct channel_t *channel = NULL; 1303 | + 1304 | + channel = (struct channel_t *)(data); 1305 | + 1306 | + if (common_validations(channel)) 1307 | + return -EINVAL; 1308 | + 1309 | + DBG("dbgfs: (%s): reading ctrl memory region\n", channel->name); 1310 | + packets = read_mem(channel, channel->ctrl_mem.pva, 1311 | + channel->ctrl_mem.size, false); 1312 | + DBG("dbgfs: (%s): Read:(%d) packets from ctrl memory region.\n", 1313 | + channel->name, packets); 1314 | + 1315 | + *val = packets; 1316 | + return 0; 1317 | +} 1318 | + 1319 | + 1320 | + 1321 | +/* 1322 | + * Debugfs interface to set the number of iterations. Running for the entire 1323 | + * memory window of self and peer can lead to lot of noise. User can use 1324 | + * iterations to run through entire memory. 1325 | + */ 1326 | +static int set_iteration(void *data, u64 val) 1327 | +{ 1328 | + struct channel_t *channel = NULL; 1329 | + 1330 | + channel = (struct channel_t *)(data); 1331 | + 1332 | + if (common_validations(channel)) 1333 | + return -EINVAL; 1334 | + 1335 | + if (val <= 0) 1336 | + return -EINVAL; 1337 | + 1338 | + channel->dbgfs_iteration = val; 1339 | + 1340 | + return 0; 1341 | +} 1342 | + 1343 | + 1344 | +/*No use - added for completeness. 
*/ 1345 | +static int get_iteration(void *data, u64 *val) 1346 | +{ 1347 | + struct channel_t *channel = NULL; 1348 | + 1349 | + channel = (struct channel_t *)(data); 1350 | + 1351 | + if (common_validations(channel)) 1352 | + return -EINVAL; 1353 | + 1354 | + *val = channel->dbgfs_iteration; 1355 | + return 0; 1356 | +} 1357 | + 1358 | +DEFINE_SIMPLE_ATTRIBUTE(self_mem_fops, self_mem_read, self_mem_write, "%llu\n"); 1359 | +DEFINE_SIMPLE_ATTRIBUTE(peer_mem_fops, peer_mem_read, peer_mem_write, "%llu\n"); 1360 | +DEFINE_SIMPLE_ATTRIBUTE(ctrl_mem_fops, ctrl_mem_read, ctrl_mem_write, "%llu\n"); 1361 | +DEFINE_SIMPLE_ATTRIBUTE(iteration_fops, get_iteration, set_iteration, "%llu\n"); 1362 | + 1363 | + 1364 | +/* 1365 | + * to clean up the debugfs interface. 1366 | + */ 1367 | +void channel_dbgfs_remove(struct channel_t *channel) 1368 | +{ 1369 | + if (channel) { 1370 | + /* Remove the debugfs directory and its files recursively. */ 1371 | + debugfs_remove_recursive(channel->dbgfs_root); 1372 | + channel->dbgfs_root = NULL; 1373 | + DBG("dbgfs: (%s) debug device removed\n", channel->name); 1374 | + } 1375 | +} 1376 | + 1377 | + 1378 | +/* 1379 | + * Helper API to create the Debugfs interface for each nvscic2c 1380 | + * channel device and enable MANUAL VERIFICATION of read/write from 1381 | + * user-space to LKM and vice-versa for same memory: Peer(Tx), Self(Rx), 1382 | + * Ctrl and Link. 1383 | + * 1384 | + * purely for debugging purpose. 1385 | + */ 1386 | +int channel_dbgfs_create(struct channel_t *channel) 1387 | +{ 1388 | + struct dentry *d = NULL; 1389 | + size_t required_size = sizeof(struct header_t); // NULL header. 1390 | + 1391 | + /* validation.
*/ 1392 | + if (!channel) { 1393 | + ERR("dbgfs: Invalid params.\n"); 1394 | + return -EINVAL; 1395 | + } 1396 | + 1397 | + if (!debugfs_initialized()) { 1398 | + ERR("dbgfs: Debugfs isn't initialised yet\n"); 1399 | + return -ENOTSUPP; 1400 | + } 1401 | + 1402 | + if ((sizeof(struct header_t) & 0x03) 1403 | + || (DATA_LEN & 0x03)) { 1404 | + ERR("dbgfs: header must be 4 byte size aligned.\n"); 1405 | + return -EINVAL; 1406 | + } 1407 | + 1408 | + 1409 | + /* we need space for at least one packet header (null/eos). */ 1410 | + if (channel->ctrl_mem.size < required_size) { 1411 | + ERR("dbgfs: Ctrl memory size less than required\n"); 1412 | + return -ENOMEM; 1413 | + } 1414 | + if (channel->tx_mem.size < required_size) { 1415 | + ERR("dbgfs: Peer memory size less than required\n"); 1416 | + return -ENOMEM; 1417 | + } 1418 | + if (channel->rx_mem.size < required_size) { 1419 | + ERR("dbgfs: Self memory size less than required\n"); 1420 | + return -ENOMEM; 1421 | + } 1422 | + 1423 | + /* create parent debugfs directory under debugfs root file-system. */ 1424 | + d = debugfs_create_dir(channel->name, NULL); 1425 | + if (IS_ERR_OR_NULL(d)) { 1426 | + ERR("dbgfs: Failed to create DebugFs root dir\n"); 1427 | + return -ENOMEM; 1428 | + } 1429 | + 1430 | + channel->dbgfs_root = d; 1431 | + channel->dbgfs_iteration = DEFAULT_ITERATION; 1432 | + 1433 | + /* create a file node to issue read/write across memories.
*/ 1434 | + d = debugfs_create_file("self_mem", 0664, 1435 | + channel->dbgfs_root, channel, 1436 | + &(self_mem_fops)); 1437 | + if (IS_ERR_OR_NULL(d)) { 1438 | + ERR("dbgfs: verify debugfs intf failed\n"); 1439 | + goto err; 1440 | + } 1441 | + d = debugfs_create_file("peer_mem", 0664, 1442 | + channel->dbgfs_root, channel, 1443 | + &(peer_mem_fops)); 1444 | + if (IS_ERR_OR_NULL(d)) { 1445 | + ERR("dbgfs: verify debugfs intf failed\n"); 1446 | + goto err; 1447 | + } 1448 | + d = debugfs_create_file("ctrl_mem", 0664, 1449 | + channel->dbgfs_root, channel, 1450 | + &(ctrl_mem_fops)); 1451 | + if (IS_ERR_OR_NULL(d)) { 1452 | + ERR("dbgfs: verify debugfs intf failed\n"); 1453 | + goto err; 1454 | + } 1455 | + 1456 | + /* create a file node to set/query the number of iterations used when 1457 | + * reading/writing sample bytes in all 3 memory regions. 1458 | + */ 1459 | + d = debugfs_create_file("iteration", 0664, 1460 | + channel->dbgfs_root, channel, 1461 | + &(iteration_fops)); 1462 | + if (IS_ERR_OR_NULL(d)) { 1463 | + ERR("dbgfs: iteration debugfs intf failed\n"); 1464 | + goto err; 1465 | + } 1466 | + 1467 | + DBG("dbgfs: (%s) debug device created\n", channel->name); 1468 | + return 0; 1469 | + 1470 | +err: 1471 | + /* Remove the debugfs directory and its files recursively. */ 1472 | + debugfs_remove_recursive(channel->dbgfs_root); 1473 | + channel->dbgfs_root = NULL; 1474 | + 1475 | + return -ENOMEM; 1476 | +} 1477 | diff --git a/drivers/misc/nvscic2c/channel-ops.c b/drivers/misc/nvscic2c/channel-ops.c 1478 | new file mode 100644 1479 | index 0000000..77ae4ad 1480 | --- /dev/null 1481 | +++ b/drivers/misc/nvscic2c/channel-ops.c 1482 | @@ -0,0 +1,249 @@ 1483 | +/* 1484 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 1485 | + * 1486 | + * This program is free software; you can redistribute it and/or modify it 1487 | + * under the terms and conditions of the GNU General Public License, 1488 | + * version 2, as published by the Free Software Foundation.
1489 | + * 1490 | + * This program is distributed in the hope it will be useful, but WITHOUT 1491 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 1492 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 1493 | + * more details. 1494 | + */ 1495 | + 1496 | +#include "channel.h" 1497 | +#include "chip-to-chip.h" 1498 | +#include 1499 | +#include 1500 | +#include 1501 | +#include 1502 | +#include 1503 | +#include 1504 | +#include 1505 | +#include 1506 | +#include 1507 | +#include 1508 | +#include 1509 | + 1510 | + 1511 | +/* prototype.*/ 1512 | +static int 1513 | +validate_size(struct channel_t *channel, 1514 | + struct dma_buff_t *self_mem_base, 1515 | + struct pci_mmio_t *peer_mem_base, 1516 | + size_t offset, size_t ch_sz); 1517 | + 1518 | +/* prototype. */ 1519 | +static void print_channel_info(struct channel_t *channel); 1520 | + 1521 | +/* 1522 | + * helper function to validate the nvscic2c channel parameters 1523 | + * as parsed from config.c or DT. 1524 | + */ 1525 | +int validate_channel_params(struct channel_t *channel) 1526 | +{ 1527 | + int ret = -EINVAL; 1528 | + 1529 | + /* valid entries for frames/slots of channel.*/ 1530 | + if ((!channel->nframes) 1531 | + || (!channel->frame_sz)) { 1532 | + ERR("(%s): Invalid Channel frame properties\n", 1533 | + channel->name); 1534 | + goto err; 1535 | + } 1536 | + 1537 | + /* channel total mem footprint must be aligned to PAGE_SIZE 1538 | + * as we map channel's Peer and Self memories to user-space 1539 | + * on PAGE boundaries. 
1540 | + */ 1541 | + if (channel->align & (~PAGE_MASK)) { 1542 | + ERR("(%s): Alignment must be multiple of PAGE_SIZE:(0x%lx)\n", 1543 | + channel->name, PAGE_SIZE); 1544 | + goto err; 1545 | + } 1546 | + 1547 | + /* Each transfer unit across PCIe must be aligned to 4 bytes.*/ 1548 | + if (channel->frame_sz & (0x03U)) { 1549 | + ERR("(%s): Channel frame size must be 4 byte aligned\n", 1550 | + channel->name); 1551 | + goto err; 1552 | + } 1553 | + 1554 | + /* all okay.*/ 1555 | + ret = 0; 1556 | +err: 1557 | + return ret; 1558 | +} 1559 | + 1560 | + 1561 | +/* 1562 | + * Allocate and initialise each nvscic2c channel internals with 3 different 1563 | + * memories: Tx(PCIe Aperture), Rx (PCIe Shared Mem) and CtrlMem for maintaining 1564 | + * control flow information not exposed to Peer. 1565 | + * 1566 | + * Fragment the nvscic2c memory: peer and self into channels starting 1567 | + * at provided offset. Also, update the offset with the channel size 1568 | + * requirements to be able to start next channel from there. 1569 | + * 1570 | + * Not thread-safe. 1571 | + */ 1572 | +int channel_alloc(struct channel_t *channel, 1573 | + struct dma_buff_t *self_mem_base, 1574 | + struct pci_mmio_t *peer_mem_base, 1575 | + off_t *curr_off) 1576 | +{ 1577 | + int ret = 0; 1578 | + int ch_sz = 0; 1579 | + 1580 | + /* arg checks: nothing is initialised yet, so return directly. */ 1581 | + if ((!channel) 1582 | + || (!self_mem_base) 1583 | + || (!peer_mem_base) 1584 | + || (!curr_off)) { 1585 | + ret = -EINVAL; 1586 | + ERR("(%s): Invalid func arguments\n", __func__); 1587 | + return ret; 1588 | + } 1589 | + 1590 | + /* initialise the channel device internals.*/ 1591 | + mutex_init(&(channel->fops_lock)); 1592 | + init_waitqueue_head(&(channel->waitq)); 1593 | + atomic_set(&(channel->db_event), 0); 1594 | + atomic_set(&(channel->link_change_event), 0); 1595 | + 1596 | + /* create a memory which is not exposed to Peer for the internal 1597 | + * counters for flow control logic.
This memory shall also be mapped 1598 | + * by user: CTRL_MEM_MMAP. Therefore align to PAGE_SIZE. 1599 | + */ 1600 | + channel->ctrl_mem.size = CH_HDR_SIZE; 1601 | + channel->ctrl_mem.size = PAGE_ALIGN(channel->ctrl_mem.size); 1602 | + channel->ctrl_mem.pva = kzalloc(channel->ctrl_mem.size, GFP_KERNEL); 1603 | + if (!channel->ctrl_mem.pva) { 1604 | + ret = -ENOMEM; 1605 | + ERR("(%s): Failed to allocate priv. mem for control counters\n", 1606 | + channel->name); 1607 | + goto err; 1608 | + } 1609 | + channel->ctrl_mem.phys_addr = virt_to_phys(channel->ctrl_mem.pva); 1610 | + 1611 | + /* calculate channel size: (Flow-Control Header Fields + Frames) 1612 | + * and alignment. 1613 | + */ 1614 | + ch_sz = ((channel->nframes * channel->frame_sz) + CH_HDR_SIZE); 1615 | + ch_sz = ALIGN(ch_sz, channel->align); 1616 | + 1617 | + /* check if we have enough space remaining in PCIe memory for ch_sz, 1618 | + * considering the allocations made for previous channels. 1619 | + */ 1620 | + ret = validate_size(channel, self_mem_base, peer_mem_base, 1621 | + *curr_off, ch_sz); 1622 | + if (ret) { 1623 | + ERR("(%s): Channel size req. cannot fit in PCIe memory\n", 1624 | + channel->name); 1625 | + goto err; 1626 | + } 1627 | + 1628 | + /* assign the offsets within the base memory for this channel. 1629 | + * Since we are not creating kernel virtual mapping for these 1630 | + * memories we only fragment physical addresses.
1631 | + */ 1632 | + channel->tx_mem.aper = peer_mem_base->aper + *curr_off; 1633 | + channel->tx_mem.size = ch_sz; 1634 | + 1635 | + channel->rx_mem.dma_handle = self_mem_base->dma_handle + *curr_off; 1636 | + channel->rx_mem.size = ch_sz; 1637 | + 1638 | + /* debug only.*/ 1639 | + print_channel_info(channel); 1640 | + 1641 | + /* all okay.*/ 1642 | + /* update the size of base mem used by this channel, used for next ch.*/ 1643 | + *curr_off += ch_sz; 1644 | + return ret; 1645 | + 1646 | +err: 1647 | + channel_free(channel); 1648 | + return ret; 1649 | +} 1650 | + 1651 | + 1652 | +/* Free the resources made for the channel. 1653 | + * 1654 | + * Support for channel_alloc()->channel_free()->channel_alloc() isn't there. 1655 | + */ 1656 | +int channel_free(struct channel_t *channel) 1657 | +{ 1658 | + int ret = 0; 1659 | + 1660 | + if (channel) { 1661 | + memset(&(channel->tx_mem), 0x0, sizeof(channel->tx_mem)); 1662 | + memset(&(channel->rx_mem), 0x0, sizeof(channel->rx_mem)); 1663 | + 1664 | + kfree(channel->ctrl_mem.pva); 1665 | + memset(&(channel->ctrl_mem), 0x0, sizeof(channel->ctrl_mem)); 1666 | + 1667 | + mutex_destroy(&(channel->fops_lock)); 1668 | + 1669 | + /* how to retract running_off to previous value? */ 1670 | + } 1671 | + 1672 | + return ret; 1673 | +} 1674 | + 1675 | +/* 1676 | + * helper function to validate whether the channel size is within the 1677 | + * range of PCIe shared memory/Aperture memory.
1678 | + */ 1679 | +static int 1680 | +validate_size(struct channel_t *channel, 1681 | + struct dma_buff_t *self_mem_base, 1682 | + struct pci_mmio_t *peer_mem_base, 1683 | + size_t offset, size_t ch_sz) 1684 | +{ 1685 | + int ret = 0; 1686 | + 1687 | + /* no validations.*/ 1688 | + 1689 | + if ((offset + ch_sz) > (peer_mem_base->size)) { 1690 | + ret = -ENOMEM; 1691 | + ERR("(%s): channel mem offset beyond pcie aperture\n", 1692 | + channel->name); 1693 | + ERR("offset=(0x%016zx), ch_sz=(0x%016zx)\n", offset, ch_sz); 1694 | + ERR("aper_mem=(%pa[p]), size=(0x%016zx)\n", 1695 | + &(peer_mem_base->aper), peer_mem_base->size); 1696 | + } 1697 | + 1698 | + if ((offset + ch_sz) > (self_mem_base->size)) { 1699 | + ret = -ENOMEM; 1700 | + ERR("(%s): channel mem offset beyond pcie shared mem\n", 1701 | + channel->name); 1702 | + ERR("offset=(0x%016zx), ch_sz=(0x%016zx)\n", offset, ch_sz); 1703 | + ERR("aper_mem=(%pa[d]), size=(0x%016zx)\n", 1704 | + &(self_mem_base->dma_handle), self_mem_base->size); 1705 | + } 1706 | + 1707 | + return ret; 1708 | +} 1709 | + 1710 | +/* DEBUG only. */ 1711 | +static void print_channel_info(struct channel_t *channel) 1712 | +{ 1713 | + DBG("\n"); 1714 | + DBG("(%s) device channel::\n", channel->name); 1715 | + DBG("\t\t alignment = (0x%x)\n", channel->align); 1716 | + DBG("\t\t ctrl hdr size = (0x%x)\n", (CH_HDR_SIZE)); 1717 | + DBG("\t\t nframes=(0x%X) frame_size=(0x%08X)", 1718 | + channel->nframes, channel->frame_sz); 1719 | + DBG("\t\t notification event::\n"); 1720 | + DBG("\t\t\ttype = (%s)\n", (channel->event_type == 0) ? 
1721 | + ("NTB doorbell"):("syncpoint")); 1722 | + DBG("\t\t\tprod_id = (0x%02X)\n", channel->prod_event_id); 1723 | + DBG("\t\t\tcons_id = (0x%02X)\n", channel->cons_event_id); 1724 | + DBG("\t\t\tstate_id = (0x%02X)\n", channel->state_event_id); 1725 | + DBG("\t\t tx_aper: (%pa[p]) size:(0x%016zx)\n", 1726 | + &(channel->tx_mem.aper), channel->tx_mem.size); 1727 | + DBG("\t\t rx_aper: (%pa[d]) size:(0x%016zx)\n", 1728 | + &(channel->rx_mem.dma_handle), channel->rx_mem.size); 1729 | + DBG("\t\t ctrl_aper: (0x%016llx) size:(0x%016zx)\n", 1730 | + channel->ctrl_mem.phys_addr, channel->ctrl_mem.size); 1731 | +} 1732 | diff --git a/drivers/misc/nvscic2c/channel.h b/drivers/misc/nvscic2c/channel.h 1733 | new file mode 100644 1734 | index 0000000..40bdd19 1735 | --- /dev/null 1736 | +++ b/drivers/misc/nvscic2c/channel.h 1737 | @@ -0,0 +1,278 @@ 1738 | +/* 1739 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 1740 | + * 1741 | + * This program is free software; you can redistribute it and/or modify it 1742 | + * under the terms and conditions of the GNU General Public License, 1743 | + * version 2, as published by the Free Software Foundation. 1744 | + * 1745 | + * This program is distributed in the hope it will be useful, but WITHOUT 1746 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 1747 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 1748 | + * more details. 1749 | + */ 1750 | + 1751 | +/* 1752 | + * Internal: to be included only in channel-cdev.c & channel-ops.c 1753 | + * or any other file that is part of nvscic2c channel and channel device 1754 | + * abstraction. This file is not supposed to be included by other entities 1755 | + * in nvscic2c module. 1756 | + * 1757 | + * Channel abstraction interfaces to module.c go in chip-to-chip.h. This 1758 | + * file gets included within channel abstraction.
1759 | + */ 1760 | +#ifndef __CHANNEL_H__ 1761 | +#define __CHANNEL_H__ 1762 | + 1763 | + 1764 | +#include "chip-to-chip.h" 1765 | +#include 1766 | +#include 1767 | +#include 1768 | +#include 1769 | +#include 1770 | +#ifdef CONFIG_DEBUG_FS 1771 | +#include 1772 | +#endif 1773 | + 1774 | + 1775 | +/* over-ride these so that: 1776 | + * - we do not instantiate as a platform device. 1777 | + * - we do not use NTB device for spews. 1778 | + * - judiciously use 80 column limit. 1779 | + * - add abstraction within module. 1780 | + */ 1781 | +#define ERR(...) pr_err(MODULE_NAME": channel:\t" __VA_ARGS__) 1782 | +#define INFO(...) pr_info(MODULE_NAME": channel:\t" __VA_ARGS__) 1783 | +#define DBG(...) pr_debug(MODULE_NAME": channel:\t" __VA_ARGS__) 1784 | + 1785 | + 1786 | +/* 1787 | + * Offsets for flow-control/state fields in the nvscic2c dev channel 1788 | + * header. All fields are at the moment 32-bit wide. 1789 | + * 1790 | + * IMPORTANT: If these change from 32-bit to other, change must be 1791 | + * made here too. Also, must be in-sync with user-space SW(same host) and 1792 | + * nvscic2c on remote host. 1793 | + * 1794 | + * These are used for writing/updating channel header fields of 1795 | + * peer host over PCIe by CPU or for reading the same fields that remote 1796 | + * updated on our PCIe shared mem over PCIe or for reading from the 1797 | + * control memory that each channel privately manages and updates. 1798 | + * 1799 | + * (CH_HDR_RESERVED_OFF+4bytes) to start data on 8-byte boundary for better 1800 | + * perf with PCIe Wr.
1801 | + */ 1802 | +#define CH_HDR_TX_CNTR_OFF (0x00) 1803 | +#define CH_HDR_RX_CNTR_OFF (0x04) 1804 | +#define CH_HDR_W_SLEEP_OFF (0x08) 1805 | +#define CH_HDR_R_SLEEP_OFF (0x0C) 1806 | +#define CH_HDR_STATE_OFF (0x10) 1807 | +#define CH_HDR_RESERVED_OFF (0x14) 1808 | +#define CH_DATA_PAYLOAD_OFF (0x18) 1809 | +#define CH_HDR_SIZE (CH_DATA_PAYLOAD_OFF) 1810 | + 1811 | + 1812 | +/* 1813 | + * nvscic2c channel state negotiation protocol. 1814 | + * 1815 | + * Must be in-sync with user-space SW on same host and with remote 1816 | + * nvscic2c host. 1817 | + * 1818 | + * As channel state management is in user-space SW this is required 1819 | + * for poll()/select() implementation, wherein if channel is not in 1820 | + * ESTABLISHED state shall return can_read()/can_write() as false. 1821 | + * 1822 | + * For detailed description of these states, refer to user-space nvscic2c SW. 1823 | + */ 1824 | +enum channel_state { 1825 | + CH_STATE_ESTABLISHED = 0, 1826 | + CH_STATE_SYNC, 1827 | + CH_STATE_ACK, 1828 | +}; 1829 | + 1830 | + 1831 | +/* 1832 | + * nvscic2c channel data-processing thread state. 1833 | + * 1834 | + * Must be in-sync with user-space SW on same host and with remote 1835 | + * nvscic2c host. 1836 | + * 1837 | + * Used to indicate to the remote entity the xfer/processing status 1838 | + * of the producer/consumer. 1839 | + * 1840 | + * Required in LKM only to initialise the corresponding fields in the 1841 | + * channel header with the defaults. For detailed description of these states, 1842 | + * refer to user-space nvscic2c SW. 1843 | + */ 1844 | +enum channel_xfer_state { 1845 | + CH_XFER_RUNNING = 0, 1846 | + CH_XFER_WAITING, 1847 | + CH_XFER_INVALID, 1848 | +}; 1849 | + 1850 | + 1851 | +/* 1852 | + * Masked offsets to return to user, allowing them to mmap 1853 | + * different memory segments of channel in user-space.
1854 | + */ 1855 | +enum mem_mmap_type { 1856 | + /* Invalid.*/ 1857 | + MEM_MMAP_INVALID = 0, 1858 | + /* Map Peer PCIe aperture: For Tx across PCIe.*/ 1859 | + PEER_MEM_MMAP, 1860 | + /* Map Self PCIe shared memory: For Rx across PCIe.*/ 1861 | + SELF_MEM_MMAP, 1862 | + /* Map Self memory(not exposed via PCIe).*/ 1863 | + CTRL_MEM_MMAP, 1864 | + /* Map Link memory segment to query link status with Peer.*/ 1865 | + LINK_MEM_MMAP, 1866 | + /* Maximum. */ 1867 | + MEM_MAX_MMAP, 1868 | +}; 1869 | + 1870 | + 1871 | +/* private data structure for every channel device. */ 1872 | +struct channel_t { 1873 | + /* properties / attributes of this channel.*/ 1874 | + char name[MAX_NAME_LEN]; 1875 | + 1876 | + /* notification for the peer - data and state.*/ 1877 | + uint8_t event_type; 1878 | + uint8_t prod_event_id; 1879 | + uint8_t cons_event_id; 1880 | + uint8_t state_event_id; 1881 | + 1882 | + /* slot/frames this channel is divided into honoring alignment.*/ 1883 | + uint16_t nframes; 1884 | + uint32_t frame_sz; 1885 | + uint16_t align; 1886 | + 1887 | + /* flow-control/channel state information updated by local for 1888 | + * remote to read/refer over PCIe. Write-Only by Self host. 1889 | + * Mapped to user-space: PEER_MEM_MMAP. 1890 | + */ 1891 | + struct pci_mmio_t tx_mem; 1892 | + 1893 | + /* flow-control/channel state information as updated by remote 1894 | + * host over PCIe. Read-Only by Self host. Exposed to Peer as 1895 | + * PCIe shared memory. Mapped to user-space: SELF_MEM_MMAP. 1896 | + */ 1897 | + struct dma_buff_t rx_mem; 1898 | + 1899 | + /* flow-control/channel state information as updated by self locally. 1900 | + * Not exposed to Peer host. Mapped to user-space: CTRL_MEM_MMAP. 1901 | + */ 1902 | + struct cpu_buff_t ctrl_mem; 1903 | + 1904 | + /* channel is a bulk data xfer channel?
if so, then direction.*/ 1905 | + enum bulk_xfer_type bulk_xfer_mode; 1906 | + 1907 | + /* device management.*/ 1908 | + int minor; 1909 | + dev_t dev; 1910 | + struct cdev cdev; 1911 | + struct device *device; 1912 | + 1913 | + /* poll/notifications.*/ 1914 | + wait_queue_head_t waitq; 1915 | + 1916 | + /* serialise access to fops.*/ 1917 | + struct mutex fops_lock; 1918 | + bool in_use; 1919 | + 1920 | + /* book-keeping of channel doorbell events.*/ 1921 | + atomic_t db_event; 1922 | + 1923 | + /* book-keeping of channel state change db events.*/ 1924 | + atomic_t link_change_event; 1925 | + 1926 | +#ifdef CONFIG_DEBUG_FS 1927 | + struct dentry *dbgfs_root; 1928 | + off_t dbgfs_iteration; 1929 | +#endif 1930 | +}; 1931 | + 1932 | + 1933 | +/* 1934 | + * Overall context for the channel sub-module of nvscic2c module. 1935 | + * This is to meet the expectation of channel abstraction. 1936 | + */ 1937 | +struct channel_drv_ctx_t { 1938 | + /* entire char device region allocated for all channels.*/ 1939 | + dev_t char_dev; 1940 | + 1941 | + /* every channel device will be registered to this class.*/ 1942 | + struct class *class; 1943 | + 1944 | + /* array of nvscic2c channel devices.*/ 1945 | + int8_t channel_nr; 1946 | + struct channel_t *channels; 1947 | + 1948 | + /* NTB db <-> Channel association. 1949 | + * To route the notification from peer to the right channel. 1950 | + * db vectors: [0, max supported-1]. 1951 | + */ 1952 | + int32_t db_vec_nr; 1953 | + struct channel_t **db_ch_tbl; 1954 | + struct mutex db_ch_tbl_lock; 1955 | + 1956 | + /* Receive area: PCIe shared mem. Peer's Rd/Wr reflect here. */ 1957 | + struct dma_buff_t self_mem_base; 1958 | + 1959 | + /* Transmit area: PCIe aperture. Self's Rd/Wr to Peer go via this.*/ 1960 | + struct pci_mmio_t peer_mem_base; 1961 | + 1962 | + /* offset to fragment the self and peer base memory into channels. 
1963 | + * every channel adds its channel size requirement to this and is 1964 | + * then used to assign offset to next channel. 1965 | + */ 1966 | + off_t running_off; 1967 | +}; 1968 | + 1969 | + 1970 | +/* helper function to validate the nvscic2c channel parameters 1971 | + * as parsed from config.c or DT. 1972 | + */ 1973 | +int validate_channel_params(struct channel_t *channel); 1974 | + 1975 | + 1976 | +/* 1977 | + * Allocate and initialise each nvscic2c channel internals with 3 different 1978 | + * memories: Tx(PCIe Aperture), Rx (PCIe Shared Mem) and CtrlMem for maintaining 1979 | + * control flow information not exposed to Peer. 1980 | + * 1981 | + * Fragment the nvscic2c memory: peer and self into channels starting 1982 | + * at provided offset. Also, update the offset with the channel size 1983 | + * requirements to be able to start next channel from there. 1984 | + * 1985 | + * Not thread-safe. 1986 | + */ 1987 | +int channel_alloc(struct channel_t *channel, 1988 | + struct dma_buff_t *self_mem_base, 1989 | + struct pci_mmio_t *peer_mem_base, 1990 | + off_t *curr_off); 1991 | + 1992 | +/* Free the resources made for the channel. 1993 | + * 1994 | + * Support for channel_alloc()->channel_free()->channel_alloc() isn't there. 1995 | + */ 1996 | +int channel_free(struct channel_t *channel); 1997 | + 1998 | +#ifdef CONFIG_DEBUG_FS 1999 | +/* 2000 | + * Helper API to create the Debugfs interface for each nvscic2c 2001 | + * channel device and enable MANUAL VERIFICATION of read/write from 2002 | + * user-space to LKM and vice-versa for same memory: Peer(Tx), Self(Rx), 2003 | + * Ctrl and Link. 2004 | + * 2005 | + * purely for debugging purpose. 2006 | + */ 2007 | +int channel_dbgfs_create(struct channel_t *channel); 2008 | + 2009 | +/* 2010 | + * to clean up the debugfs interface.
2011 | + */ 2012 | +void channel_dbgfs_remove(struct channel_t *channel); 2013 | +#endif //CONFIG_DEBUG_FS 2014 | + 2015 | +#endif // __CHANNEL_H__ 2016 | diff --git a/drivers/misc/nvscic2c/chip-to-chip.h b/drivers/misc/nvscic2c/chip-to-chip.h 2017 | new file mode 100644 2018 | index 0000000..62d5266 2019 | --- /dev/null 2020 | +++ b/drivers/misc/nvscic2c/chip-to-chip.h 2021 | @@ -0,0 +1,409 @@ 2022 | +/* 2023 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 2024 | + * 2025 | + * This program is free software; you can redistribute it and/or modify it 2026 | + * under the terms and conditions of the GNU General Public License, 2027 | + * version 2, as published by the Free Software Foundation. 2028 | + * 2029 | + * This program is distributed in the hope it will be useful, but WITHOUT 2030 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 2031 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 2032 | + * more details. 2033 | + */ 2034 | + 2035 | +/* 2036 | + * Internal to nvscic2c module. This file is not supposed to be included 2037 | + * by any other external modules. 2038 | + */ 2039 | +#ifndef __CHIP_TO_CHIP_H__ 2040 | +#define __CHIP_TO_CHIP_H__ 2041 | + 2042 | + 2043 | +#include 2044 | +#include 2045 | +#include 2046 | + 2047 | + 2048 | +/* Name of our module: used in all the files, channel device prefix.*/ 2049 | +#define MODULE_NAME "nvscic2c" 2050 | + 2051 | + 2052 | +/* Maximum length of any string used - channel name, thread name, etc. */ 2053 | +#define MAX_NAME_LEN (32) 2054 | + 2055 | + 2056 | +/* some channels may not have all the DBs that we have in our data-structures, 2057 | + * e.g. Bulk transfer channels have two DBs rather than three. 2058 | + * use this MACRO to differentiate when somebody wants a NIL DB(=NO DB) 2059 | + */ 2060 | +#define DB_ID_NIL (0xFF) 2061 | + 2062 | + 2063 | +/* PCIe aperture memory type for Tx/Rx Peer via BAR.
*/ 2064 | +struct pci_mmio_t { 2065 | + /* Physical PCIe aperture - BAR aperture.*/ 2066 | + phys_addr_t aper; 2067 | + 2068 | + /* PVA for the BAR aperture.*/ 2069 | + void __iomem *pva; 2070 | + 2071 | + /* size of the BAR aperture.*/ 2072 | + size_t size; 2073 | +}; 2074 | + 2075 | + 2076 | +/* PCIe Shared memory registered/exported to peer. 2077 | + * Either pointing to reserved fixed address or memory 2078 | + * allocated by dma_buf API. 2079 | + */ 2080 | +struct dma_buff_t { 2081 | + /* local VA for CPU access. */ 2082 | + void *pva; 2083 | + 2084 | + /* iova(iommu=ON) or bus address/physical address. */ 2085 | + dma_addr_t dma_handle; 2086 | + 2087 | + /* size of the memory allocated. */ 2088 | + size_t size; 2089 | +}; 2090 | + 2091 | + 2092 | +/* CPU-only accessible memory which is not PCIe aper or PCIe 2093 | + * shared memory. Typically will contain information of memory 2094 | + * allocated via kmalloc()/kzalloc*(). 2095 | + */ 2096 | +struct cpu_buff_t { 2097 | + /* cpu address(va). */ 2098 | + void *pva; 2099 | + 2100 | + /* physical address. */ 2101 | + uint64_t phys_addr; 2102 | + 2103 | + /* size of the memory allocated. */ 2104 | + size_t size; 2105 | +}; 2106 | + 2107 | +/* 2108 | + * Parameters that each c2c channel shall be configured from. These parameters 2109 | + * are either populated from DT file(aarch64) or via config file(x86_64). 2110 | + * Applies to both CPU and Bulk transfer channels. 2111 | + * 2112 | + * These are read-only for the rest of the nvscic2c module. 2113 | + * 2114 | + * This serves as an abstraction, nvscic2c module uses this to setup the c2c 2115 | + * channels without worrying whether they come from DT or a config file. 2116 | + */ 2117 | +struct channel_param_t { 2118 | + /* Id for a channel, picked up from config.c.*/ 2119 | + uint8_t ch_id; 2120 | + 2121 | + /* human readable name assigned to a c2c channel. - Debug only.
*/ 2122 | + char ch_name[MAX_NAME_LEN]; 2123 | + 2124 | + /* Configuration for event notification for a c2c channel, as read 2125 | + * from DT or configuration file. 2126 | + * - event notification type: NTB doorbell or Proprietary, 2127 | + * - event notification ID for proxy producer: wait & trigger. 2128 | + * - event notification ID for proxy consumer: wait & trigger. 2129 | + * - event notification ID for state management: wait & trigger. 2130 | + */ 2131 | + uint8_t event_type; 2132 | + uint8_t prod_event_id; 2133 | + uint8_t cons_event_id; 2134 | + uint8_t state_event_id; 2135 | + 2136 | + /* every c2c channel is fragmented into slots/frames that self 2137 | + * can read what remote has written into. Both for Tx and Rx memory 2138 | + * in identical way. Alignment to be honored. 2139 | + */ 2140 | + int16_t nframes; 2141 | + uint32_t frame_sz; 2142 | + int16_t align; 2143 | + 2144 | + /* channel is a bulk data xfer channel? if so, then direction.*/ 2145 | + enum bulk_xfer_type bulk_xfer_mode; 2146 | +}; 2147 | + 2148 | + 2149 | +/* 2150 | + * Configurable parameters for the nvscic2c module. These contain the 2151 | + * c2c channel parameters also but along with them global parameters which 2152 | + * are configurable for c2c module. 2153 | + * 2154 | + * These are read-only for the rest of the nvscic2c module. 2155 | + * 2156 | + * This serves as an abstraction, nvscic2c module uses this to setup the c2c 2157 | + * channels without worrying whether they come from DT or a config file. 2158 | + * 2159 | + * All of these (and nested) configuration parameters MUST BE IN SYNC WITH 2160 | + * REMOTE HOST. 2161 | + */ 2162 | +struct c2c_param_t { 2163 | + /* minimum PCIe shared memory window - configurable as required 2164 | + * for supporting multiple use-cases. - As read from DT module 2165 | + * or configuration module. 2166 | + */ 2167 | + size_t req_mw_sz; 2168 | + 2169 | + /* fixed physical address to be used for setting up PCIe shared mem.
2170 | + * Parameter given while loading module shall over-ride the same 2171 | + * setting provided in config.c file. 2172 | + */ 2173 | + uint64_t fixed_mw_addr; 2174 | + size_t fixed_mw_sz; 2175 | + bool use_fixed_addr; 2176 | + 2177 | + /* all enabled c2c channel configuration - CPU or Bulk.*/ 2178 | + uint8_t channel_nr; 2179 | + struct channel_param_t *ch_params; 2180 | +}; 2181 | + 2182 | + 2183 | +/* 2184 | + * defines the nvscic2c module driver context. Shall contain all the 2185 | + * c2c channels: configuration, char devices, link management thread, 2186 | + * base memory allocated for Rx(PCIe shared memory) and mapped for Tx 2187 | + * memory (PCIe aperture). 2188 | + */ 2189 | +struct c2c_drv_ctx_t { 2190 | + /* the configuration for module and individual channels.*/ 2191 | + struct c2c_param_t c2c_param; 2192 | + 2193 | + /* Receive area: PCIe shared mem. Peer's Rd/Wr reflect here. */ 2194 | + struct dma_buff_t self_mem; 2195 | + 2196 | + /* Transmit area: PCIe aperture. Self's Rd/Wr to Peer go via this.*/ 2197 | + struct pci_mmio_t peer_mem; 2198 | +}; 2199 | + 2200 | + 2201 | +/* 2202 | + * called by nvscic2c module to parse the static configuration we have 2203 | + * set above. On aarch64 this is DT parsing, but on x86_64 we do not want 2204 | + * to use DT mechanism, so we parse the statically defined configuration 2205 | + * above into nvscic2c module c2c_param_t structure. 2206 | + */ 2207 | +int config_parse(struct c2c_param_t *c2c_param); 2208 | + 2209 | + 2210 | +/* 2211 | + * called by nvscic2c module to free up the memory allocated during 2212 | + * config_parse(). A like-for-like replacement for dt_release() on 2213 | + * x86_64. 2214 | + */ 2215 | +int config_release(struct c2c_param_t *c2c_param); 2216 | + 2217 | + 2218 | +/* 2219 | + * Interface for nvscic2c driver to register itself as a NTB client driver.
2220 | + * 2221 | + * Because we expect nvscic2c window size requirements to be fulfilled 2222 | + * by PCIe NTB shared memory, this function should be called after successful 2223 | + * parsing of config.c 2224 | + * 2225 | + * THIS INTERNALLY WAITS FOR NTB PROBE TO COMPLETE. IF NTB DRIVER WASN'T LOADED 2226 | + * IT WOULD TIMEOUT AND RETURN FAILURE. 2227 | + */ 2228 | +int ntb_client_register(struct c2c_drv_ctx_t *drv_ctx); 2229 | + 2230 | + 2231 | +/* 2232 | + * Interface for nvscic2c driver to unload the NTB client driver. 2233 | + * 2234 | + * As a result the NTB link between two SoCs would go DOWN. 2235 | + */ 2236 | +void ntb_client_unregister(struct c2c_drv_ctx_t *drv_ctx); 2237 | + 2238 | + 2239 | +/* 2240 | + * NTB client driver has a private context; share the PCIe shared memory 2241 | + * and aperture details with the caller. 2242 | + * 2243 | + * Can be called once NTB client is registered with NTB properly. 2244 | + */ 2245 | +int ntb_client_query_mem_info(struct dma_buff_t *self_mem, 2246 | + struct pci_mmio_t *peer_mem); 2247 | + 2248 | +/* 2249 | + * This function allows nvscic2c driver to set the link state as 2250 | + * UP(true) or DOWN(false) when the NTB client driver was registered properly. 2251 | + * 2252 | + * Expected use is to set the link to TRUE when nvscic2c is done with setup 2253 | + * of all c2c channels and is ready to exchange data. 2254 | + * 2255 | + * true: link up, false otherwise. 2256 | + */ 2257 | +int ntb_client_set_link_status(enum link_status status); 2258 | + 2259 | + 2260 | +/* 2261 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 2262 | + * channel functionality doesn't call directly into NTB apis 2263 | + * but only via ntb-client. 2264 | + */ 2265 | +int ntb_client_db_vector_count(void); 2266 | + 2267 | + 2268 | +/* 2269 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 2270 | + * channel functionality doesn't call directly into NTB apis 2271 | + * but only via ntb-client.
2272 | + */ 2273 | +int ntb_client_db_set_mask(uint64_t db_bits); 2274 | + 2275 | + 2276 | +/* 2277 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 2278 | + * channel functionality doesn't call directly into NTB apis 2279 | + * but only via ntb-client. 2280 | + */ 2281 | +int ntb_client_db_clear_mask(uint64_t db_bits); 2282 | + 2283 | +/* 2284 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 2285 | + * channel functionality doesn't call directly into NTB apis 2286 | + * but only via ntb-client. 2287 | + */ 2288 | +int ntb_client_db_clear(uint64_t db_bits); 2289 | + 2290 | +/* 2291 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 2292 | + * channel functionality doesn't call directly into NTB apis 2293 | + * but only via ntb-client. 2294 | + */ 2295 | +int ntb_client_peer_db_set(uint64_t db_bits); 2296 | + 2297 | +/* 2298 | + * Export nvscic2c(channel-cdev.c) dev node portion of self memory: 2299 | + * Rx memory to userspace via mmap() call. Reason for this implementation 2300 | + * being, if PCIe shared mem was allocated using dma-buff apis, channel 2301 | + * abstraction would not have ntbdev to do mmap for dma buffer. Also, 2302 | + * channel doesn't know if fixed address(iommu=off) is being used. 2303 | + * 2304 | + * If nvscic2c module has called link_mgmt_release(), it must not 2305 | + * refer to link_status_mem anytime after that. 2306 | + */ 2307 | +int ntb_client_mmap_self_mem(struct vm_area_struct *vma, 2308 | + struct dma_buff_t *self_mem); 2309 | + 2310 | +/* 2311 | + * Entry point for the nvscic2c channel char device sub-module/abstraction. 2312 | + * 2313 | + * On successful return (0), devices would have been created and ready to 2314 | + * accept ioctls from user-space application. 2315 | + * 2316 | + * Mapping of each NTB doorbell to a C2C channel is also maintained here.
2317 | + * 2318 | + * We must come here after setting up the NTB client with PCIe shared memory 2319 | + * (Self) and PCIe aperture(Peer) available. 2320 | + */ 2321 | +int channel_setup_devices(struct c2c_drv_ctx_t *drv_ctx); 2322 | + 2323 | + 2324 | +/* exit point for nvscic2c channel char device sub-module/abstraction.*/ 2325 | +int channel_release_devices(struct c2c_drv_ctx_t *drv_ctx); 2326 | + 2327 | + 2328 | +/* 2329 | + * Function called by nvscic2c module.c on seeing a change in the 2330 | + * NTB link status. Here we pass on this event to each channel 2331 | + * which is required for their poll() implementation. 2332 | + * 2333 | + * This is supposed to be called only on a change in link status, not 2334 | + * for every NTB link event(hb). 2335 | + */ 2336 | +int channel_link_event(enum link_status status); 2337 | + 2338 | + 2339 | +/* 2340 | + * Function called by NTB client(ntb-client) on getting a NTB DB 2341 | + * event. 2342 | + * 2343 | + * We receive the DB vector/index which triggered this event. 2344 | + * We internally go through the channel and db association and 2345 | + * pass on the db event to relevant channel. 2346 | + * 2347 | + * A channel will/may have multiple DB's but we should get only 2348 | + * 1 DB per event callback. 2349 | + */ 2350 | +int channel_db_event(int db_idx); 2351 | + 2352 | + 2353 | +/* 2354 | + * Entry point for the link management sub-module/abstraction. 2355 | + * 2356 | + * Link mgmt abstraction keeps track of previous link state. If this 2357 | + * changes to a different state than previous state, we invoke a callback 2358 | + * nvscic2c module(module.c) registers with link_mgmt to notify all channels 2359 | + * for link status change. 2360 | + * 2361 | + * On successful return (0), link monitoring thread sends the first link 2362 | + * UP event to remote and then waits for Link UP heart beats. 2363 | + * 2364 | + * THIS ENABLES THE NTB LINK, hence must be called when all setup is done.
2365 | + * 2366 | + * Also allocates the link status memory which nvscic2c dev exports to user- 2367 | + * space via mmap. 2368 | + * 2369 | + * CAVEAT: nvscic2c dev nodes export mmap which shall map link status memory 2370 | + * to user-space. We would start this link_mgmt once all devices are created 2371 | + * but we also allocate this status memory when starting link_mgmt thread. 2372 | + * Hence, there would be a small window where the nvscic2c devices have been 2373 | + * setup and mmap is exported but link_mgmt module may not be ready yet. 2374 | + */ 2375 | +/* callback options to register with link mgmt module. */ 2376 | +struct link_mgmt_ops { 2377 | + /* a callback invoked on a change in link status.*/ 2378 | + void (*link_status_changed)(enum link_status status, 2379 | + void *ctx); 2380 | + /* context that link_mgmt should return along with the cb.*/ 2381 | + void *ctx; 2382 | +}; 2383 | +int link_mgmt_start(struct link_mgmt_ops *ops); 2384 | + 2385 | + 2386 | +/* exit point for link mgmt sub-module/abstraction.*/ 2387 | +int link_mgmt_stop(void); 2388 | + 2389 | + 2390 | +/* 2391 | + * used by channel abstraction to query the current link status. 2392 | + * for poll() implementation. 2393 | + * 2394 | + * If nvscic2c module has called link_mgmt_release(), it must not 2395 | + * refer link_status_mem anytime after that. 2396 | + */ 2397 | +enum link_status link_mgmt_get_link_status(void); 2398 | + 2399 | + 2400 | +/* 2401 | + * used by channel abstraction to query the size of the link status 2402 | + * memory that shall be exported to userspace SW. for ioctl()/mmap() 2403 | + * implementation. 2404 | + * 2405 | + * If nvscic2c module has called link_mgmt_release(), it must not 2406 | + * refer link_status_mem anytime after that. 2407 | + */ 2408 | +size_t link_mgmt_get_status_mem_size(void); 2409 | + 2410 | + 2411 | +/* 2412 | + * helper function to mmap the link status memory for an nvscic2c 2413 | + * dev node.
This is required because the channel abstraction doesn't 2414 | + * keep the link status memory credentials with itself. 2415 | + * 2416 | + * If nvscic2c module has called link_mgmt_release(), it must not 2417 | + * refer link_status_mem anytime after that. 2418 | + */ 2419 | +int link_mgmt_mmap_status_mem(struct vm_area_struct *vma); 2420 | + 2421 | + 2422 | +/* 2423 | + * Registered with NTB client abstraction(ntb-client.c). On every link 2424 | + * event received by peer for NTB link, this callback shall be invoked. 2425 | + * 2426 | + * The status received in this callback is fed to monitor thread of 2427 | + * link mgmt thread to decide change in link status. 2428 | + */ 2429 | +void link_mgmt_event_cb(enum link_status status); 2430 | +#endif //__CHIP_TO_CHIP_H__ 2431 | diff --git a/drivers/misc/nvscic2c/config.c b/drivers/misc/nvscic2c/config.c 2432 | new file mode 100644 2433 | index 0000000..78e31e7 2434 | --- /dev/null 2435 | +++ b/drivers/misc/nvscic2c/config.c 2436 | @@ -0,0 +1,394 @@ 2437 | +/* 2438 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 2439 | + * 2440 | + * This program is free software; you can redistribute it and/or modify it 2441 | + * under the terms and conditions of the GNU General Public License, 2442 | + * version 2, as published by the Free Software Foundation. 2443 | + * 2444 | + * This program is distributed in the hope it will be useful, but WITHOUT 2445 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 2446 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 2447 | + * more details. 2448 | + */ 2449 | + 2450 | +#include "chip-to-chip.h" 2451 | +#include 2452 | +#include 2453 | +#include 2454 | +#include 2455 | + 2456 | + 2457 | +/* over-ride these so that: 2458 | + * - we do not instantiate as a platform device. 2459 | + * - we do not use NTB device for spews. 2460 | + * - judiciously use 80 column limit. 2461 | + * - add abstraction within module. 
2462 | + */ 2463 | +#define ERR(...) pr_err(MODULE_NAME": config:\t" __VA_ARGS__) 2464 | +#define INFO(...) pr_info(MODULE_NAME": config:\t" __VA_ARGS__) 2465 | +#define DBG(...) pr_debug(MODULE_NAME": config:\t" __VA_ARGS__) 2466 | + 2467 | + 2468 | +/* Maximum c2c channels supported.*/ 2469 | +#define MAX_CHANNELS (16) 2470 | + 2471 | + 2472 | +/* 2473 | + * Frames/Slots in which the C2C channel Tx and Rx memory is fragmented 2474 | + * into. Must be in-sync with the remote host for the same given channel. 2475 | + * Applicable to both CPU and Bulk channels. 2476 | + */ 2477 | +struct frame_cfg_t { 2478 | + /* frames/slot count.*/ 2479 | + uint16_t nframes; 2480 | + 2481 | + /* size of each frame/slot. */ 2482 | + uint32_t frame_sz; 2483 | +}; 2484 | + 2485 | + 2486 | +/* 2487 | + * nevent configuration per c2c channel (channels wait for / trigger 2488 | + * MSI-X using nevent abstraction). This data-type allows user to 2489 | + * specify the nevent IDs used for a given channel. Must be in-sync 2490 | + * with remote host for a given channel. 2491 | + */ 2492 | +struct nevent_cfg_t { 2493 | + /* Notification type: NTB doorbell or Proprietary. Unused for x86_64.*/ 2494 | + uint8_t type; 2495 | + 2496 | + /* Used by proxy-prod thread to wait for/trigger MSI-X notification 2497 | + * from/for proxy-cons. 2498 | + */ 2499 | + uint8_t id_1; 2500 | + 2501 | + /* Used by proxy-cons thread to wait for/trigger MSI-X notification 2502 | + * from/for proxy-prod. 2503 | + */ 2504 | + uint8_t id_2; 2505 | + 2506 | + /* Used by state management thread for c2c channel state mgmt.*/ 2507 | + uint8_t id_3; 2508 | +}; 2509 | + 2510 | + 2511 | +/* 2512 | + * C2C channel configuration as entered here in config.c. 2513 | + * Shall be read-only for other files in the nvscic2c module. 2514 | + */ 2515 | +struct channel_cfg_t { 2516 | + /* is channel enabled to be used.*/ 2517 | + bool enable; 2518 | + 2519 | + /* alignment for this channel, must be multiple of 4K (PAGE_SIZE).
*/ 2520 | + uint16_t align; 2521 | + 2522 | + /* PCIe shared memory fragmentation/slots.*/ 2523 | + struct frame_cfg_t frames; 2524 | + 2525 | + /* nevent configuration.*/ 2526 | + struct nevent_cfg_t nevent; 2527 | + 2528 | + /* if it's bulk data xfer channel, then what role - prod or cons.*/ 2529 | + enum bulk_xfer_type bulk_xfer_mode; 2530 | +}; 2531 | + 2532 | + 2533 | +/* 2534 | + * Editable/configurable C2C module configuration by user. Any change here shall 2535 | + * reflect only after re-compilation and re-flashing of the newly built nvscic2c.ko 2536 | + * module. 2537 | + * 2538 | + * The static initialisation below must be in-sync with remote nvscic2c host 2539 | + * or must complement remote nvscic2c host for bulk xfer channel role. 2540 | + * 2541 | + * THIS IS REPLACEMENT FOR DEVICE TREE FOR AARCH64 PLATFORM. 2542 | + */ 2543 | +static struct c2c_cfg_t { 2544 | + /* size of the mem window we shall expose to remote host in BAR#4. 2545 | + * This is proportional to nvscic2c channels we support. 2546 | + */ 2547 | + size_t req_mw_sz; 2548 | + 2549 | + /* fixed physical address to be used for setting up PCIe shared mem. 2550 | + * ASSUMING THIS WAY IS WHEN PLATFORM DOESN'T SUPPORT IOMMU. 2551 | + */ 2552 | + uint64_t fixed_mw_addr; 2553 | + 2554 | + /* its size. 2555 | + * ASSUMING THIS WAY IS WHEN PLATFORM DOESN'T SUPPORT IOMMU. 2556 | + */ 2557 | + size_t fixed_mw_sz; 2558 | + 2559 | + /* settings for all the supported C2C channels. - bulk or CPU.*/ 2560 | + struct channel_cfg_t channels[MAX_CHANNELS]; 2561 | +} c2c_config = { 2562 | + /* Expected memory window size we expect to work with. This will 2563 | + * be the size of PCIe shared memory exposed to remote. Must be 2564 | + * in-sync with remote host. Proportional to channels we support. 2565 | + */ 2566 | + .req_mw_sz = 0x10000000, 2567 | + 2568 | + /* change this to reflect the physical memory reserved for mw. 2569 | + * This would get over-ridden by similar module parameter. 2570 | + * if provided.
2571 | + */ 2572 | + .fixed_mw_addr = 0x0, 2573 | + .fixed_mw_sz = 0x0, 2574 | + 2575 | + /* all the supported channels with this host. Again must be 2576 | + * in-sync with remote host. 2577 | + */ 2578 | + .channels = { 2579 | + /* CPU channel. RemoteHost <-> x86. */ 2580 | + [0] = { 2581 | + .enable = true, 2582 | + .align = 0x1000, 2583 | + .frames = { 0x40, 0x600 }, 2584 | + .nevent = { 0x00, 0x00, 0x01, 0x02 }, 2585 | + }, 2586 | + 2587 | + /* Bulk transfer channel RemoteHost -> x86. */ 2588 | + [1] = { 2589 | + .enable = true, 2590 | + .align = 0x1000, 2591 | + .frames = { 0x03, 0x400000 }, 2592 | + .nevent = { 0x0, 0x03, DB_ID_NIL, 0x04 }, 2593 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER, 2594 | + }, 2595 | + 2596 | + /* Bulk transfer channel RemoteHost <- x86. */ 2597 | + [2] = { 2598 | + .enable = true, 2599 | + .align = 0x1000, 2600 | + .frames = { 0x03, 0x400000 }, 2601 | + .nevent = { 0x0, DB_ID_NIL, 0x05, 0x06 }, 2602 | + .bulk_xfer_mode = BULK_XFER_TYPE_CONSUMER, 2603 | + }, 2604 | + 2605 | + /* Bulk transfer channel RemoteHost -> x86. */ 2606 | + [3] = { 2607 | + .enable = true, 2608 | + .align = 0x1000, 2609 | + .frames = { 0x03, 0x400000 }, 2610 | + .nevent = { 0x0, 0x07, DB_ID_NIL, 0x08 }, 2611 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER_PCIE_READ, 2612 | + }, 2613 | + 2614 | + /* Bulk transfer channel RemoteHost <- x86. */ 2615 | + [4] = { 2616 | + .enable = true, 2617 | + .align = 0x1000, 2618 | + .frames = { 0x03, 0x400000 }, 2619 | + .nevent = { 0x0, DB_ID_NIL, 0x09, 0x0A }, 2620 | + .bulk_xfer_mode = BULK_XFER_TYPE_CONSUMER_PCIE_READ, 2621 | + }, 2622 | + 2623 | + /* Bulk transfer channel RemoteHost -> x86. */ 2624 | + [5] = { 2625 | + .enable = true, 2626 | + .align = 0x1000, 2627 | + .frames = { 0x02, 0xF00000 }, 2628 | + .nevent = { 0x0, 0x0B, DB_ID_NIL, 0x0C }, 2629 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER, 2630 | + }, 2631 | + 2632 | + /* Bulk transfer channel RemoteHost -> x86. 
*/ 2633 | + [6] = { 2634 | + .enable = true, 2635 | + .align = 0x1000, 2636 | + .frames = { 0x02, 0xF00000 }, 2637 | + .nevent = { 0x0, 0x0D, DB_ID_NIL, 0x0E }, 2638 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER, 2639 | + }, 2640 | + 2641 | + /* Bulk transfer channel RemoteHost -> x86. */ 2642 | + [7] = { 2643 | + .enable = true, 2644 | + .align = 0x1000, 2645 | + .frames = { 0x02, 0xF00000 }, 2646 | + .nevent = { 0x0, 0x0F, DB_ID_NIL, 0x10 }, 2647 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER, 2648 | + }, 2649 | + 2650 | + /* Bulk transfer channel RemoteHost -> x86. */ 2651 | + [8] = { 2652 | + .enable = true, 2653 | + .align = 0x1000, 2654 | + .frames = { 0x02, 0xF00000 }, 2655 | + .nevent = { 0x0, 0x011, DB_ID_NIL, 0x12 }, 2656 | + .bulk_xfer_mode = BULK_XFER_TYPE_PRODUCER, 2657 | + }, 2658 | + 2659 | + /* Bulk transfer channel RemoteHost <- x86. */ 2660 | + [9] = { 2661 | + .enable = true, 2662 | + .align = 0x1000, 2663 | + .frames = { 0x02, 0x2800000 }, 2664 | + .nevent = { 0x0, DB_ID_NIL, 0x13, 0x14 }, 2665 | + .bulk_xfer_mode = BULK_XFER_TYPE_CONSUMER_PCIE_READ, 2666 | + }, 2667 | + }, 2668 | +}; 2669 | + 2670 | +/* prototype.*/ 2671 | +static void config_print(struct c2c_param_t *c2c_param); 2672 | + 2673 | + 2674 | +/* 2675 | + * called by nvscic2c module to free up the memory allocated during 2676 | + * config_parse(). Like for a like replacement for dt_release() on 2677 | + * x86_64. 2678 | + */ 2679 | +int config_release(struct c2c_param_t *c2c_param) 2680 | +{ 2681 | + int ret = 0; 2682 | + 2683 | + if (c2c_param == NULL) 2684 | + return ret; 2685 | + 2686 | + kfree(c2c_param->ch_params); 2687 | + c2c_param->ch_params = NULL; 2688 | + 2689 | + return ret; 2690 | +} 2691 | + 2692 | + 2693 | +/* 2694 | + * called by nvscic2c module to parse the static configuration we have 2695 | + * set above. 
On aarch64 this is DT parsing, but on x86_64 we do not want 2696 | + * to use DT mechanism, parse the statically defined configuration above 2697 | + * into nvscic2c module c2c_param_t structure. 2698 | + */ 2699 | +int config_parse(struct c2c_param_t *c2c_param) 2700 | +{ 2701 | + uint8_t i = 0, j = 0; 2702 | + int ret = 0; 2703 | + 2704 | + /* validation. */ 2705 | + if (c2c_param == NULL) { 2706 | + ERR("(%s): Invalid Param\n", __func__); 2707 | + ret = -EINVAL; 2708 | + goto err; 2709 | + } 2710 | + 2711 | + /* start by allocating space for max channels supported.*/ 2712 | + c2c_param->ch_params = kzalloc((sizeof(*c2c_param->ch_params) 2713 | + * MAX_CHANNELS), 2714 | + GFP_KERNEL); 2715 | + if (c2c_param->ch_params == NULL) { 2716 | + ret = -ENOMEM; 2717 | + ERR("Failed to allocate driver ctx\n"); 2718 | + goto err; 2719 | + } 2720 | + 2721 | + /* loop through all statically defined channels and populate 2722 | + * c2c_param with enabled channel information. 2723 | + */ 2724 | + for (i = 0, j = 0; i < MAX_CHANNELS; i++) { 2725 | + struct channel_param_t *param = NULL; 2726 | + struct channel_cfg_t *cfg = NULL; 2727 | + 2728 | + cfg = &(c2c_config.channels[i]); 2729 | + param = &(c2c_param->ch_params[j]); 2730 | + 2731 | + /* defaults which are non-zero/non-null.*/ 2732 | + param->prod_event_id = DB_ID_NIL; 2733 | + param->cons_event_id = DB_ID_NIL; 2734 | + param->state_event_id = DB_ID_NIL; 2735 | + param->nframes = -1; 2736 | + 2737 | + /* if channel is not enabled, skip to next one.*/ 2738 | + if (cfg->enable != true) 2739 | + continue; 2740 | + 2741 | + param->ch_id = i; 2742 | + param->nframes = cfg->frames.nframes; 2743 | + param->frame_sz = cfg->frames.frame_sz; 2744 | + param->align = cfg->align; 2745 | + param->event_type = cfg->nevent.type; 2746 | + param->prod_event_id = cfg->nevent.id_1; 2747 | + param->cons_event_id = cfg->nevent.id_2; 2748 | + param->state_event_id = cfg->nevent.id_3; 2749 | + param->bulk_xfer_mode = cfg->bulk_xfer_mode; 2750 | + 
snprintf(param->ch_name, MAX_NAME_LEN, "%s_%d", 2751 | + MODULE_NAME, param->ch_id); 2752 | + j++; 2753 | + } 2754 | + 2755 | + /* if we couldn't find any enabled channels.*/ 2756 | + if (!j) { 2757 | + ret = -ENODATA; 2758 | + ERR("Failed to parse any enabled c2c channel in config\n"); 2759 | + goto err; 2760 | + } 2761 | + 2762 | + /* device/global settings which are not per-channel. */ 2763 | + c2c_param->channel_nr = j; 2764 | + c2c_param->req_mw_sz = c2c_config.req_mw_sz; 2765 | + if ((c2c_config.fixed_mw_addr >= 0x80000000) 2766 | + && (c2c_config.fixed_mw_sz)) { 2767 | + c2c_param->use_fixed_addr = true; 2768 | + c2c_param->fixed_mw_addr = c2c_config.fixed_mw_addr; 2769 | + c2c_param->fixed_mw_sz = c2c_config.fixed_mw_sz; 2770 | + } 2771 | + 2772 | + /* debug only. */ 2773 | + config_print(c2c_param); 2774 | + 2775 | + return ret; 2776 | + 2777 | +err: 2778 | + config_release(c2c_param); 2779 | + return ret; 2780 | +} 2781 | + 2782 | + 2783 | +/* 2784 | + * Debug only. 2785 | + */ 2786 | +static void config_print(struct c2c_param_t *c2c_param) 2787 | +{ 2788 | + int i = 0; 2789 | + 2790 | + DBG("\n"); 2791 | + DBG("C2C config file parsing leads to:\n"); 2792 | + DBG("\tmin_win_sz = (0x%08zX)\n", c2c_param->req_mw_sz); 2793 | + if (c2c_param->use_fixed_addr) { 2794 | + DBG("\tfixed_mw_addr = (0x%08llX)\n", 2795 | + c2c_param->fixed_mw_addr); 2796 | + DBG("\tfixed_mw_sz = (0x%08zX)\n", 2797 | + c2c_param->fixed_mw_sz); 2798 | + } 2799 | + DBG("\ttotal channels = (%u)\n", c2c_param->channel_nr); 2800 | + for (i = 0; i < c2c_param->channel_nr; i++) { 2801 | + struct channel_param_t *param = NULL; 2802 | + enum bulk_xfer_type xfer_mode = BULK_XFER_TYPE_NONE; 2803 | + 2804 | + param = &(c2c_param->ch_params[i]); 2805 | + xfer_mode = param->bulk_xfer_mode; 2806 | + DBG("\t\t(%s)::\n", param->ch_name); 2807 | + DBG("\t\t\tch_id = (%u)\n", param->ch_id); 2808 | + DBG("\t\t\talignment = (0x%x)\n", param->align); 2809 | + DBG("\t\t\tnframes = (0x%X) 
frame_size=(0x%08X)", 2810 | + param->nframes, param->frame_sz); 2811 | + DBG("\t\t\tnotification event::\n"); 2812 | + DBG("\t\t\t\ttype = (%s)\n", (param->event_type == 0) ? 2813 | + ("NTB doorbell"):("syncpoint")); 2814 | + DBG("\t\t\t\tprod_id = (0x%02X)\n", param->prod_event_id); 2815 | + DBG("\t\t\t\tcons_id = (0x%02X)\n", param->cons_event_id); 2816 | + DBG("\t\t\t\tstate_id = (0x%02X)\n", param->state_event_id); 2817 | + if (xfer_mode == BULK_XFER_TYPE_NONE) 2818 | + DBG("\t\t\tCPU xfer device\n"); 2819 | + else if (xfer_mode == BULK_XFER_TYPE_PRODUCER) 2820 | + DBG("\t\t\tBulk Producer device\n"); 2821 | + else if (xfer_mode == BULK_XFER_TYPE_CONSUMER) 2822 | + DBG("\t\t\tBulk Consumer device\n"); 2823 | + else if (xfer_mode == BULK_XFER_TYPE_PRODUCER_PCIE_READ) 2824 | + DBG("\t\t\tBulk Producer device using PCIe read\n"); 2825 | + else if (xfer_mode == BULK_XFER_TYPE_CONSUMER_PCIE_READ) 2826 | + DBG("\t\t\tBulk Consumer device using PCIe read\n"); 2827 | + } 2828 | + DBG("C2C config file parsing ends\n"); 2829 | + DBG("\n"); 2830 | +} 2831 | diff --git a/drivers/misc/nvscic2c/link-mgmt.c b/drivers/misc/nvscic2c/link-mgmt.c 2832 | new file mode 100644 2833 | index 0000000..02dd887 2834 | --- /dev/null 2835 | +++ b/drivers/misc/nvscic2c/link-mgmt.c 2836 | @@ -0,0 +1,579 @@ 2837 | +/* 2838 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 2839 | + * 2840 | + * This program is free software; you can redistribute it and/or modify it 2841 | + * under the terms and conditions of the GNU General Public License, 2842 | + * version 2, as published by the Free Software Foundation. 2843 | + * 2844 | + * This program is distributed in the hope it will be useful, but WITHOUT 2845 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 2846 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 2847 | + * more details. 
2848 | + */ 2849 | + 2850 | +#include "chip-to-chip.h" 2851 | +#include 2852 | +#include 2853 | +#include 2854 | +#include 2855 | +#include 2856 | +#include 2857 | +#include 2858 | +#include 2859 | +#include 2860 | +#include 2861 | + 2862 | + 2863 | +/* over-ride these so that: 2864 | + * - we do not instantiate as a platform device. 2865 | + * - we do not use NTB device for spews. 2866 | + * - judiciously use 80 column limit. 2867 | + * - add abstraction within module. 2868 | + */ 2869 | +#define ERR(...) pr_err(MODULE_NAME": link-mgmt:\t" __VA_ARGS__) 2870 | +#define INFO(...) pr_info(MODULE_NAME": link-mgmt:\t" __VA_ARGS__) 2871 | +#define DBG(...) pr_debug(MODULE_NAME": link-mgmt:\t" __VA_ARGS__) 2872 | + 2873 | + 2874 | +/* wait/trigger period between two heart beats.*/ 2875 | +#define HB_TIME_INTERVAL (1000) 2876 | + 2877 | +/* wait for these many heart-beats to be missed 2878 | + * to consider link down. 2879 | + */ 2880 | +#define HB_MISS_THRESH (4) 2881 | + 2882 | + 2883 | +/* 2884 | + * Overall context for the link-mgmt sub-module of nvscic2c module. 2885 | + * This is to meet the expectation of link-mgmt abstraction. 
2886 | + */ 2887 | +struct link_mgmt_ctx_t { 2888 | + /* we export a few functions, error out on them if this is not ready.*/ 2889 | + volatile bool initialised; 2890 | + 2891 | + /* callback ops to propagate status changed event.*/ 2892 | + struct link_mgmt_ops ops; 2893 | + 2894 | + /* link status - as received by NTB module - Read-only for link_mgmt.*/ 2895 | + atomic_t peer_status; 2896 | + 2897 | + /* count of link_event recvd.*/ 2898 | + atomic_t hb_counter; 2899 | + 2900 | + /* event miss count.*/ 2901 | + /* link monitor kthread identifier.*/ 2902 | + struct task_struct *monitor; 2903 | + volatile bool monitor_shutdown; 2904 | + struct completion monitor_shutdown_compl; 2905 | + 2906 | + /* monitor thread keeps waiting for link events.*/ 2907 | + wait_queue_head_t monitor_waitq; 2908 | + 2909 | + /* link trigger kthread identifier.*/ 2910 | + struct task_struct *trigger; 2911 | + volatile bool trigger_shutdown; 2912 | + struct completion trigger_shutdown_compl; 2913 | + 2914 | + /* timer for trigger thread.*/ 2915 | + wait_queue_head_t trigger_waitq; 2916 | + 2917 | + /* contains current link status for nvscic2c LKM and 2918 | + * nvscic2c user-space SW to refer to. Shall be mapped 2919 | + * in user-space, therefore must be PAGE_SIZE aligned. 2920 | + */ 2921 | + struct cpu_buff_t status_mem; 2922 | +}; 2923 | + 2924 | + 2925 | +/* prototype.*/ 2926 | +static int handle_status_changed(enum link_status status); 2927 | + 2928 | +/* prototype.*/ 2929 | +static int start_hb_trigger_task(void); 2930 | + 2931 | +/* prototype.*/ 2932 | +static int stop_hb_trigger_task(void); 2933 | + 2934 | +/* prototype.*/ 2935 | +static int monitor_taskfn(void *arg); 2936 | + 2937 | +/* prototype.*/ 2938 | +static int trigger_taskfn(void *arg); 2939 | + 2940 | + 2941 | +/* link_mgmt context.
If not making it global here, add it to c2c_drv_ctx_t.*/ 2942 | +static struct link_mgmt_ctx_t *link_ctx; 2943 | + 2944 | + 2945 | +/* 2946 | + * Registered with NTB client abstraction(ntb-client.c). On every link 2947 | + * event received by peer for NTB link, this callback shall be invoked. 2948 | + * 2949 | + * The status received in this callback is fed to monitor thread of 2950 | + * link mgmt thread to decide change in link status. 2951 | + */ 2952 | +void link_mgmt_event_cb(enum link_status status) 2953 | +{ 2954 | + if ((!link_ctx) 2955 | + || (!link_ctx->initialised)) { 2956 | + return; 2957 | + } 2958 | + 2959 | + /* we received a link event.*/ 2960 | + atomic_inc(&(link_ctx->hb_counter)); 2961 | + atomic_set(&(link_ctx->peer_status), status); 2962 | + 2963 | + /* ask monitor thread to handle the hb.*/ 2964 | + wake_up_interruptible(&(link_ctx->monitor_waitq)); 2965 | +} 2966 | + 2967 | + 2968 | +/* 2969 | + * processing loop for link_mgmt module. 2970 | + * The tasks starts by sending LINK_UP to remote peer. 2971 | + * It then waits for a link event(UP or DOWN) from peer for HB_TIME_INTERVAL 2972 | + * (ms). if: 2973 | + * - Link event received within HB_TIME_INTERVAL, check for change in 2974 | + * link status w.r.t previous event. 2975 | + * - if changed, notify nvscic2c module to forward to channels. 2976 | + * - if no change, no action required. 2977 | + * - No link event received for consecutive HB_MISS_THRESH iterations, 2978 | + * we deduce remote went away abruptly and forward to channels. 
2979 | + */ 2980 | +static int monitor_taskfn(void *arg) 2981 | +{ 2982 | + int ret = 0; 2983 | + uint32_t hb_missed = 0; 2984 | + bool status_changed = false; 2985 | + enum link_status peer_status = LINK_DOWN; 2986 | + enum link_status self_status = LINK_DOWN; 2987 | + 2988 | + DBG("starting link monitor thread\n"); 2989 | + 2990 | + /* start by signalling remote we are up.*/ 2991 | + ntb_client_set_link_status(LINK_UP); 2992 | + 2993 | + while (!link_ctx->monitor_shutdown) { 2994 | + /* wait for hb/time-out to occur. */ 2995 | + ret = wait_event_interruptible_timeout(link_ctx->monitor_waitq, 2996 | + (atomic_read(&(link_ctx->hb_counter)) 2997 | + || (link_ctx->monitor_shutdown)), 2998 | + msecs_to_jiffies(HB_TIME_INTERVAL)); 2999 | + /* it comes out of wait in following cases: 3000 | + * - shutdown (locally done). 3001 | + * - link event/hb from remote. 3002 | + * - timed-out waiting for hb. 3003 | + * - unexpected error from wait_. 3004 | + */ 3005 | + if (link_ctx->monitor_shutdown) { 3006 | + /* thread is exiting.*/ 3007 | + continue; 3008 | + } else if (ret > 0) { 3009 | + /* we got a link event / hb.*/ 3010 | + hb_missed = 0; 3011 | + atomic_dec(&link_ctx->hb_counter); 3012 | + peer_status = atomic_read(&(link_ctx->peer_status)); 3013 | + if (self_status != peer_status) { 3014 | + self_status = peer_status; 3015 | + status_changed = true; 3016 | + } 3017 | + } else if (ret == 0) { 3018 | + /* timedout waiting for hb.*/ 3019 | + hb_missed++; 3020 | + /* hb missed for long. set ourselves to LINK_DOWN. 
*/ 3021 | + if (hb_missed >= HB_MISS_THRESH) { 3022 | + hb_missed = 0; 3023 | + if (self_status == LINK_UP) { 3024 | + self_status = LINK_DOWN; 3025 | + status_changed = true; 3026 | + } 3027 | + } 3028 | + } else { 3029 | + /* unhandled error.*/ 3030 | + ERR("link-monitor-thread: Unexpected error\n"); 3031 | + link_ctx->monitor_shutdown = true; 3032 | + self_status = LINK_DOWN; 3033 | + status_changed = true; 3034 | + } 3035 | + 3036 | + /* handle any status change.*/ 3037 | + if (status_changed == true) { 3038 | + handle_status_changed(self_status); 3039 | + status_changed = false; 3040 | + } 3041 | + } 3042 | + 3043 | + /* signal all channels our link is down now.*/ 3044 | + handle_status_changed(LINK_DOWN); 3045 | + 3046 | + /* signal remote we are going away.*/ 3047 | + ntb_client_set_link_status(LINK_DOWN); 3048 | + 3049 | + DBG("exiting link monitor thread\n"); 3050 | + 3051 | + /* we do not use kthread_stop(), but wait on this.*/ 3052 | + complete(&(link_ctx->monitor_shutdown_compl)); 3053 | + 3054 | + return 0; 3055 | +} 3056 | + 3057 | + 3058 | +/* 3059 | + * Thread function for triggering heart-beats to remote over NTB. 3060 | + * 3061 | + * This thread keeps itself blocked on a wait queue for HB_TIME_INTERVAL 3062 | + * milli-seconds and, when it wakes up, signals remote/peer that its link 3063 | + * is still UP. 3064 | + * 3065 | + * This task shall send link UP HB(s) at regular intervals. This has no 3066 | + * intelligence but only to send HB events to remote. Therefore, when 3067 | + * to start or stop is done by link monitor task. 3068 | + */ 3069 | +static int trigger_taskfn(void *arg) 3070 | +{ 3071 | + int ret = 0; 3072 | + 3073 | + if (!link_ctx) { 3074 | + ERR("(%s): link mgmt ctx not ready yet\n", __func__); 3075 | + return -EINVAL; 3076 | + } 3077 | + 3078 | + DBG("starting link trigger thread\n"); 3079 | + 3080 | + while (!link_ctx->trigger_shutdown) { 3081 | + /* wait for link event to occur.
*/ 3082 | + ret = wait_event_interruptible_timeout(link_ctx->trigger_waitq, 3083 | + link_ctx->trigger_shutdown, 3084 | + msecs_to_jiffies(HB_TIME_INTERVAL)); 3085 | + /* check for shutdown.*/ 3086 | + if (link_ctx->trigger_shutdown) { 3087 | + /* thread is exiting.*/ 3088 | + continue; 3089 | + } 3090 | + 3091 | + /* timer timed-out, trigger hb.*/ 3092 | + ntb_client_set_link_status(LINK_UP); 3093 | + } 3094 | + 3095 | + DBG("exiting link trigger thread\n"); 3096 | + 3097 | + /* we do not use kthread_stop(), but wait on this.*/ 3098 | + complete(&(link_ctx->trigger_shutdown_compl)); 3099 | + 3100 | + return 0; 3101 | +} 3102 | + 3103 | + 3104 | +/* 3105 | + * helper API to complete the required steps to handle change 3106 | + * in NTB link status. We start with: 3107 | + * - updating the status_mem which is shared with user-space SW. 3108 | + * - notify the nvscic2c module about change in link status to 3109 | + * be forwarded to all channels. 3110 | + * - start/stop link event trigger thread. 3111 | + */ 3112 | +static int handle_status_changed(enum link_status status) 3113 | +{ 3114 | + /* update global status memory.*/ 3115 | + *((enum link_status *)(link_ctx->status_mem.pva)) = status; 3116 | + 3117 | + /* notify channels about change in link status.*/ 3118 | + if (link_ctx->ops.link_status_changed) 3119 | + link_ctx->ops.link_status_changed(status, link_ctx->ops.ctx); 3120 | + 3121 | + /* start or stop link trigger thread.*/ 3122 | + if (status == LINK_UP) 3123 | + start_hb_trigger_task(); 3124 | + else 3125 | + stop_hb_trigger_task(); 3126 | + 3127 | + return 0; 3128 | +} 3129 | + 3130 | + 3131 | +/* 3132 | + * helper functions to start triggering the link heart-beat. 3133 | + * 3134 | + * This task shall send link UP HB(s) at regular intervals. This has no 3135 | + * intelligence but only to send HB events to remote. Therefore, when 3136 | + * to start or stop is done by link monitor task. 
3137 | + */ 3138 | +static int start_hb_trigger_task(void) 3139 | +{ 3140 | + char thread_name[MAX_NAME_LEN] = {'\0'}; 3141 | + 3142 | + /* skipping check for link_ctx. internal api.*/ 3143 | + 3144 | + /* err: if running already.*/ 3145 | + if (link_ctx->trigger) { 3146 | + ERR("trigger thread running already\n"); 3147 | + return -EINVAL; 3148 | + } 3149 | + 3150 | + /* start the link trigger thread.*/ 3151 | + snprintf(thread_name, (MAX_NAME_LEN - 1), "%s-link-trigger", 3152 | + MODULE_NAME); 3153 | + link_ctx->trigger = kthread_run(trigger_taskfn, link_ctx, 3154 | + thread_name); 3155 | + if (IS_ERR_OR_NULL(link_ctx->trigger)) { 3156 | + ERR("Failed to create link trigger task\n"); 3157 | + return PTR_ERR(link_ctx->trigger); 3158 | + } 3159 | + 3160 | + return 0; 3161 | +} 3162 | + 3163 | + 3164 | +/* 3165 | + * Stop the link HB trigger task. Typically, link monitor thread will 3166 | + * stop the link trigger task on seeing that peer has gone away, 3167 | + * so that we do not send any more link UP events. 3168 | + */ 3169 | +static int stop_hb_trigger_task(void) 3170 | +{ 3171 | + int ret = 0; 3172 | + 3173 | + /* skipping check for link_ctx. internal api.*/ 3174 | + 3175 | + /* stopped already.*/ 3176 | + if (IS_ERR_OR_NULL(link_ctx->trigger)) 3177 | + return ret; 3178 | + 3179 | + /* initiate stop.*/ 3180 | + link_ctx->trigger_shutdown = true; 3181 | + wake_up_interruptible(&(link_ctx->trigger_waitq)); 3182 | + 3183 | + /* wait for thread to complete.*/ 3184 | + wait_for_completion_interruptible(&(link_ctx->trigger_shutdown_compl)); 3185 | + 3186 | + /* prepare to start again when asked.*/ 3187 | + link_ctx->trigger = NULL; 3188 | + link_ctx->trigger_shutdown = false; 3189 | + reinit_completion(&(link_ctx->trigger_shutdown_compl)); 3190 | + 3191 | + DBG("link trigger thread stopped\n"); 3192 | + 3193 | + return ret; 3194 | +} 3195 | + 3196 | + 3197 | +/* 3198 | + * Entry point for the link management sub-module/abstraction.
3199 | + * 3200 | + * Link mgmt abstraction keeps track of previous link state. If this 3201 | + * changes to a different state than previous state, we invoke a callback 3202 | + * nvscic2c module(module.c) registers with link_mgmt to notify all channels 3203 | + * for link status change. 3204 | + * 3205 | + * On successful return (0), link monitoring thread would have been created 3206 | + * which sends the first link UP event to remote and then waits for Link UP 3207 | + * heart beats. 3208 | + * 3209 | + * There is a trigger thread which keeps sending link HB(s) which gets 3210 | + * started only when we receive link UP event from remote. 3211 | + * 3212 | + * Also allocates the link status memory which nvscic2c dev exports to user- 3213 | + * space via mmap. 3214 | + * 3215 | + * CAVEAT: nvscic2c dev nodes export mmap which shall map link status memory 3216 | + * to user-space. We would start this link_mgmt once all devices are created 3217 | + * but we also allocate this status memory when starting link_mgmt thread. 3218 | + * Hence, there would be a small window where the nvscic2c devices have been 3219 | + * setup and mmap is exported but link_mgmt module may not be ready yet. 
 3220 | + */ 3221 | +int link_mgmt_start(struct link_mgmt_ops *ops) 3222 | +{ 3223 | + int ret = 0; 3224 | + struct cpu_buff_t status_mem = {0}; 3225 | + char thread_name[MAX_NAME_LEN] = {'\0'}; 3226 | + 3227 | + /* args check. return directly: nothing to tear down yet.*/ 3228 | + if ((!ops) 3229 | + || (!ops->link_status_changed)) { 3230 | + ret = -EINVAL; 3231 | + ERR("(%s): Invalid function args passed\n", __func__); 3232 | + return ret; 3233 | + } 3234 | + 3235 | + /* if already instantiated. do not tear down the live context.*/ 3236 | + if (link_ctx) { 3237 | + ret = -EALREADY; 3238 | + ERR("(%s): Link mgmt context already instantiated\n", 3239 | + __func__); 3240 | + return ret; 3241 | + } 3242 | + 3243 | + /* start by allocating context.*/ 3244 | + link_ctx = kzalloc(sizeof(*link_ctx), GFP_KERNEL); 3245 | + if (!link_ctx) { 3246 | + ret = -ENOMEM; 3247 | + ERR("Failed to allocate Link mgmt context.\n"); 3248 | + goto err; 3249 | + } 3250 | + link_ctx->ops.link_status_changed = ops->link_status_changed; 3251 | + link_ctx->ops.ctx = ops->ctx; 3252 | + 3253 | + /* allocate the link status memory which user-space SW can 3254 | + * map and read to know the link status. Therefore allocate 3255 | + * a PAGE_SIZE-aligned region.
 3256 | + */ 3257 | + status_mem.size = PAGE_ALIGN(sizeof(enum link_status)); 3258 | + status_mem.pva = kzalloc(status_mem.size, GFP_KERNEL); 3259 | + if (!status_mem.pva) { 3260 | + ret = -ENOMEM; 3261 | + ERR("Failed to allocate link status memory.\n"); 3262 | + goto err; 3263 | + } 3264 | + *((enum link_status *)(status_mem.pva)) = LINK_DOWN; 3265 | + status_mem.phys_addr = virt_to_phys(status_mem.pva); 3266 | + memcpy(&(link_ctx->status_mem), &(status_mem), sizeof(status_mem)); 3267 | + 3268 | + /* initialise internals.*/ 3269 | + link_ctx->monitor_shutdown = false; 3270 | + link_ctx->trigger_shutdown = false; 3271 | + atomic_set(&(link_ctx->hb_counter), 0); 3272 | + atomic_set(&(link_ctx->peer_status), LINK_DOWN); 3273 | + init_waitqueue_head(&(link_ctx->monitor_waitq)); 3274 | + init_waitqueue_head(&(link_ctx->trigger_waitq)); 3275 | + init_completion(&(link_ctx->monitor_shutdown_compl)); 3276 | + init_completion(&(link_ctx->trigger_shutdown_compl)); 3277 | + 3278 | + /* start the link monitor thread.*/ 3279 | + snprintf(thread_name, (MAX_NAME_LEN - 1), "%s-link-monitor", 3280 | + MODULE_NAME); 3281 | + link_ctx->monitor = kthread_run(monitor_taskfn, link_ctx, 3282 | + thread_name); 3283 | + if (IS_ERR_OR_NULL(link_ctx->monitor)) { 3284 | + ERR("Failed to create link monitor task\n"); 3285 | + ret = PTR_ERR(link_ctx->monitor); 3286 | + goto err; 3287 | + } 3288 | + 3289 | + /* all okay.
*/ 3290 | + link_ctx->initialised = true; 3291 | + return ret; 3292 | + 3293 | +err: 3294 | + link_mgmt_stop(); 3295 | + return ret; 3296 | +} 3297 | + 3298 | + 3299 | +/* exit point for link mgmt sub-module/abstraction.*/ 3300 | +int link_mgmt_stop(void) 3301 | +{ 3302 | + int ret = 0; 3303 | + 3304 | + if (!link_ctx) 3305 | + return ret; 3306 | + 3307 | + /* we are dying now, accept no more calls from other abstractions.*/ 3308 | + link_ctx->initialised = false; 3309 | + 3310 | + /* stop link monitor thread.*/ 3311 | + if (!IS_ERR_OR_NULL(link_ctx->monitor)) { 3312 | + /* initiate stop.*/ 3313 | + link_ctx->monitor_shutdown = true; 3314 | + wake_up_interruptible(&(link_ctx->monitor_waitq)); 3315 | + 3316 | + /* wait for thread to complete.*/ 3317 | + wait_for_completion_interruptible( 3318 | + &(link_ctx->monitor_shutdown_compl)); 3319 | + DBG("link monitor thread stopped\n"); 3320 | + } 3321 | + 3322 | + kfree(link_ctx->status_mem.pva); 3323 | + link_ctx->status_mem.pva = NULL; 3324 | + 3325 | + kfree(link_ctx); 3326 | + link_ctx = NULL; 3327 | + 3328 | + return ret; 3329 | +} 3330 | + 3331 | + 3332 | +/* 3333 | + * used by channel abstraction to query the current link status. 3334 | + * for poll() implementation. 3335 | + */ 3336 | +enum link_status link_mgmt_get_link_status(void) 3337 | +{ 3338 | + enum link_status status = LINK_DOWN; 3339 | + 3340 | + /* if called without calling link_mgmt_start().*/ 3341 | + if ((!link_ctx) 3342 | + || (!link_ctx->initialised)) { 3343 | + ERR("(%s): link mgmt ctx not ready yet\n", __func__); 3344 | + return status; 3345 | + } 3346 | + 3347 | + status = *((enum link_status *)(link_ctx->status_mem.pva)); 3348 | + return status; 3349 | +} 3350 | + 3351 | + 3352 | +/* 3353 | + * used by channel abstraction to query the size of the link status 3354 | + * memory that shall be exported to userspace SW. for ioctl()/mmap() 3355 | + * implementation.
 3356 | + * 3357 | + * If nvscic2c module has called link_mgmt_stop(), it must not 3358 | + * refer to link_status_mem anytime after that. 3359 | + */ 3360 | +size_t link_mgmt_get_status_mem_size(void) 3361 | +{ 3362 | + /* if called without calling link_mgmt_start().*/ 3363 | + if ((!link_ctx) 3364 | + || (!link_ctx->initialised)) { 3365 | + ERR("(%s): link mgmt ctx not ready yet\n", __func__); 3366 | + return 0; 3367 | + } 3368 | + 3369 | + return link_ctx->status_mem.size; 3370 | +} 3371 | + 3372 | + 3373 | +/* 3374 | + * helper function to mmap the link status memory for an nvscic2c 3375 | + * dev node. This is required because the channel abstraction doesn't 3376 | + * keep the link status memory credentials with itself. 3377 | + * 3378 | + * If nvscic2c module has called link_mgmt_stop(), it must not 3379 | + * refer to link_status_mem anytime after that. 3380 | + */ 3381 | +int link_mgmt_mmap_status_mem(struct vm_area_struct *vma) 3382 | +{ 3383 | + int ret = 0; 3384 | + 3385 | + /* args check.*/ 3386 | + if (!vma) { 3387 | + ret = -EINVAL; 3388 | + ERR("(%s): Function args invalid\n", __func__); 3389 | + goto err; 3390 | + } 3391 | + 3392 | + /* if called without calling link_mgmt_start().*/ 3393 | + if ((!link_ctx) 3394 | + || (!link_ctx->initialised)) { 3395 | + ret = -EINVAL; 3396 | + ERR("(%s): link mgmt ctx not ready yet\n", __func__); 3397 | + goto err; 3398 | + } 3399 | + 3400 | + /* check again.*/ 3401 | + if ((vma->vm_end - vma->vm_start) != link_ctx->status_mem.size) { 3402 | + ret = -EINVAL; 3403 | + ERR("(%s): link_mgmt status mem mmap size mismatch\n", 3404 | + __func__); 3405 | + goto err; 3406 | + } 3407 | + 3408 | + /* add remap_pfn_range(). here.
*/ 3409 | + ret = remap_pfn_range(vma, vma->vm_start, 3410 | + PFN_DOWN(link_ctx->status_mem.phys_addr), 3411 | + link_ctx->status_mem.size, 3412 | + vma->vm_page_prot); 3413 | +err: 3414 | + return ret; 3415 | +} 3416 | diff --git a/drivers/misc/nvscic2c/module.c b/drivers/misc/nvscic2c/module.c 3417 | new file mode 100644 3418 | index 0000000..156a6de 3419 | --- /dev/null 3420 | +++ b/drivers/misc/nvscic2c/module.c 3421 | @@ -0,0 +1,304 @@ 3422 | +/* 3423 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 3424 | + * 3425 | + * This program is free software; you can redistribute it and/or modify it 3426 | + * under the terms and conditions of the GNU General Public License, 3427 | + * version 2, as published by the Free Software Foundation. 3428 | + * 3429 | + * This program is distributed in the hope it will be useful, but WITHOUT 3430 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 3431 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 3432 | + * more details. 3433 | + */ 3434 | + 3435 | +/* 3436 | + * This is Chip-To-Chip(Host-To-Host) interconnect module built 3437 | + * on MicroSemi/Microchip NTB bridge module. It is used to achieve data(small, 3438 | + * bulk) transfers via CPU Reads/Writes or remote DMA Reads/Writes over PCIe 3439 | + * wire from PCIe Host(x86) to another(Tegra). RP(Root-Port) <-> RootPort only. 3440 | + * 3441 | + * This sample is specifically written for NT mode support in Microsemi PLX 3442 | + * switch hosted on NVIDIA DRIVE Platform. The NT-EP ports (both), domains must 3443 | + * have the identical BAR sizes burnt. 
3444 | + */ 3445 | + 3446 | +#include "chip-to-chip.h" 3447 | +#include 3448 | +#include 3449 | +#include 3450 | +#include 3451 | +#include 3452 | + 3453 | + 3454 | +#define DRIVER_LICENSE "GPL v2" 3455 | +#define DRIVER_DESCRIPTION "Host-To-Host data transfer module over PCIe" 3456 | +#define DRIVER_VERSION "1.0" 3457 | +#define DRIVER_RELDATE "January 2019" 3458 | +#define DRIVER_AUTHOR "Nvidia Corporation" 3459 | +#define DRIVER_NAME MODULE_NAME 3460 | +MODULE_DESCRIPTION(DRIVER_DESCRIPTION); 3461 | +MODULE_LICENSE(DRIVER_LICENSE); 3462 | +MODULE_VERSION(DRIVER_VERSION); 3463 | +MODULE_AUTHOR(DRIVER_AUTHOR); 3464 | + 3465 | +/* 3466 | + * module parameter that user can supply for the reserved physical address 3467 | + * for PCIe shared memory. - This can also be provided in config.c but 3468 | + * if provided here shall over-ride the value given in c2c_config. 3469 | + * 3470 | + * ASSUMING THIS WAY IS WHEN PLATFORM DOESN'T SUPPORT IOMMU. 3471 | + */ 3472 | +static ulong fixed_mw_addr; 3473 | +module_param(fixed_mw_addr, ulong, 0644); 3474 | +MODULE_PARM_DESC(fixed_mw_addr, 3475 | + "Physical address reserved for PCIe shared mem on BAR4"); 3476 | + 3477 | +/* module parameter for the size that user can give for reserved physical 3478 | + * memory. - This can also be provided in config.c but 3479 | + * if provided here shall over-ride the value given in c2c_config. 3480 | + * 3481 | + * ASSUMING THIS WAY IS WHEN PLATFORM DOESN'T SUPPORT IOMMU. 3482 | + */ 3483 | +static ulong fixed_mw_size; 3484 | +module_param(fixed_mw_size, ulong, 0644); 3485 | +MODULE_PARM_DESC(fixed_mw_size, 3486 | + "Size of reserved memory for PCIe shared mem on BAR4"); 3487 | + 3488 | + 3489 | +/* over-ride these so that: 3490 | + * - we do not instantiate as a platform device. 3491 | + * - we do not use NTB device for spews. 3492 | + * - judiciously use 80 column limit. 3493 | + * - add abstraction within module. 3494 | + */ 3495 | +#define ERR(...) 
pr_err(MODULE_NAME": module:\t" __VA_ARGS__) 3496 | +#define INFO(...) pr_info(MODULE_NAME": module:\t" __VA_ARGS__) 3497 | +#define DBG(...) pr_debug(MODULE_NAME": module:\t" __VA_ARGS__) 3498 | + 3499 | + 3500 | +/* Global nvscic2c module/driver context. */ 3501 | +static struct c2c_drv_ctx_t *drv_ctx; 3502 | + 3503 | + 3504 | +/* 3505 | + * Debug only. Print Rx(Self memory) PCIe shared memory 3506 | + * and Tx(Peer memory)-PCIe aperture props. 3507 | + */ 3508 | +static void print_mem_info(struct dma_buff_t *self_mem, 3509 | + struct pci_mmio_t *peer_mem) 3510 | +{ 3511 | + if ((self_mem != NULL) 3512 | + && (peer_mem != NULL)) { 3513 | + DBG("\n"); 3514 | + DBG("Total Peer-memory: Tx memory:\n"); 3515 | + DBG("\t\taper:(%pa[p])\n", &(peer_mem->aper)); 3516 | + DBG("\t\tsize:(0x%016zx)\n", peer_mem->size); 3517 | + 3518 | + DBG("Total Self-memory: Rx memory:\n"); 3519 | + DBG("\t\tiova:(%pa[d])\n", &(self_mem->dma_handle)); 3520 | + DBG("\t\tsize:(0x%016zx)\n", self_mem->size); 3521 | + DBG("\n"); 3522 | + } 3523 | +} 3524 | + 3525 | + 3526 | +/* 3527 | + * callback nvscic2c module registers with link_mgmt abstraction 3528 | + * to be invoked every time NTB link status is changed. On getting 3529 | + * invoked we forward it to channel abstraction, so that each channel 3530 | + * can check the NTB link status in its poll() implementation. 3531 | + */ 3532 | +static void link_status_changed(enum link_status status, void *ctx) 3533 | +{ 3534 | + channel_link_event(status); 3535 | +} 3536 | + 3537 | + 3538 | +/* entry point for the driver/module. 3539 | + * - We can work without NTB driver with remote memory addresses available, 3540 | + * e.g. R.P <-> E.P with proprietary notification mechanism and therefore, 3541 | + * our probe is not called from NTB driver so that we can remove NTB dependency 3542 | + * easily.
 3543 | + */ 3544 | +static int module_probe(void) 3545 | +{ 3546 | + int ret = 0; 3547 | + struct c2c_param_t *c2c_param = NULL; 3548 | + struct link_mgmt_ops link_ops = {0}; 3549 | + 3550 | + DBG("(%s): Entering\n", __func__); 3551 | + 3552 | + /* start by allocating the global driver ctx.*/ 3553 | + drv_ctx = kzalloc(sizeof(*drv_ctx), GFP_KERNEL); 3554 | + if (drv_ctx == NULL) { 3555 | + ret = -ENOMEM; 3556 | + ERR("Failed to allocate driver ctx\n"); 3557 | + goto err_alloc; 3558 | + } 3559 | + c2c_param = &(drv_ctx->c2c_param); 3560 | + 3561 | + /* parse the configurable options for c2c module: channels, 3562 | + * fixed memory, etc. 3563 | + */ 3564 | + ret = config_parse(c2c_param); 3565 | + if (ret != 0) { 3566 | + ERR("Failed to parse c2c module config options.\n"); 3567 | + goto err_config; 3568 | + } 3569 | + 3570 | + /* if user had provided fixed physical address as mw backing, 3571 | + * over-ride the same config provided in config.c 3572 | + */ 3573 | + if ((fixed_mw_addr >= 0x80000000) 3574 | + && (fixed_mw_size)) { 3575 | + INFO("Using (0x%08lx+0x%08lx) for NTB PCIe shared mw\n", 3576 | + fixed_mw_addr, fixed_mw_size); 3577 | + 3578 | + c2c_param->fixed_mw_addr = fixed_mw_addr; 3579 | + c2c_param->fixed_mw_sz = fixed_mw_size; 3580 | + c2c_param->use_fixed_addr = true; 3581 | + } 3582 | + 3583 | + 3584 | + /* if no C2C channels were enabled.*/ 3585 | + if (c2c_param->channel_nr <= 0) { 3586 | + ret = -EINVAL; 3587 | + ERR("No C2C enabled channels parsed in config.c. Exiting\n"); 3588 | + goto err_ch_nr; 3589 | + } 3590 | + 3591 | + /* Window size for fixed addresses must cover the requested 3592 | + * size, which is fragmented into nvscic2c channels on PAGE_SIZE 3593 | + * boundaries.
 3594 | + */ 3595 | + if ((c2c_param->use_fixed_addr) 3596 | + && (c2c_param->fixed_mw_sz < c2c_param->req_mw_sz)) { 3597 | + ret = -ENOMEM; 3598 | + ERR("Fixed mw sz:(0x%08zx) less than requested:(0x%08zx).\n", 3599 | + c2c_param->fixed_mw_sz, c2c_param->req_mw_sz); 3600 | + goto err_fixed_sz; 3601 | + } 3602 | + 3603 | + /* register with NTB module as NTB client. 3604 | + * we register and wait for probe to get called to isolate 3605 | + * NTB dependency. 3606 | + */ 3607 | + ret = ntb_client_register(drv_ctx); 3608 | + if (ret != 0x0) { 3609 | + ERR("Failed to register nvscic2c as NTB client\n"); 3610 | + goto err_ntb_register; 3611 | + } 3612 | + 3613 | + /* query the overall PCIe rx(PCIe shared mem) and tx(PCIe aperture) 3614 | + * memory from ntb client. We have this call if we had to work without 3615 | + * NTB and use EP BAR address directly. 3616 | + */ 3617 | + ret = ntb_client_query_mem_info(&(drv_ctx->self_mem), 3618 | + &(drv_ctx->peer_mem)); 3619 | + if (ret) { 3620 | + ERR("Failed to query self and peer mem windows\n"); 3621 | + goto err_query_mem; 3622 | + } 3623 | + print_mem_info(&(drv_ctx->self_mem), &(drv_ctx->peer_mem)); 3624 | + 3625 | + /* create the nvscic2c devices now.*/ 3626 | + ret = channel_setup_devices(drv_ctx); 3627 | + if (ret) { 3628 | + ERR("Failed to setup the nvscic2c devices\n"); 3629 | + goto err_devices; 3630 | + } 3631 | + 3632 | + /* register with link management abstraction, we should expect 3633 | + * callback on every link state change. This sends the first 3634 | + * NTB link UP to remote also.
3635 | + */ 3636 | + link_ops.link_status_changed = link_status_changed; 3637 | + ret = link_mgmt_start(&(link_ops)); 3638 | + if (ret) { 3639 | + ERR("Failed to initialise link_mgmt abstraction\n"); 3640 | + goto err_link; 3641 | + } 3642 | + 3643 | + DBG("(%s): Loaded module\n", __func__); 3644 | + return ret; 3645 | + 3646 | +err_link: 3647 | + channel_release_devices(drv_ctx); 3648 | + 3649 | +err_devices: 3650 | +err_query_mem: 3651 | + ntb_client_unregister(drv_ctx); 3652 | + 3653 | +err_ntb_register: 3654 | +err_fixed_sz: 3655 | +err_ch_nr: 3656 | + config_release(&(drv_ctx->c2c_param)); 3657 | + 3658 | +err_config: 3659 | + kfree(drv_ctx); 3660 | + drv_ctx = NULL; 3661 | + 3662 | +err_alloc: 3663 | + DBG("(%s): Error: Exiting\n", __func__); 3664 | + return ret; 3665 | +} 3666 | + 3667 | + 3668 | +/* exit point for the driver/module.*/ 3669 | +static int module_remove(void) 3670 | +{ 3671 | + int ret = 0; 3672 | + 3673 | + DBG("(%s): Entering\n", __func__); 3674 | + 3675 | + if (drv_ctx == NULL) { 3676 | + DBG("(%s): Exiting\n", __func__); 3677 | + return ret; 3678 | + } 3679 | + 3680 | + /* This signals LINK_DOWN to peer, must be first call. */ 3681 | + link_mgmt_stop(); 3682 | + 3683 | + /* release the channel devices.*/ 3684 | + channel_release_devices(drv_ctx); 3685 | + 3686 | + /* deregister as NTB client.*/ 3687 | + ntb_client_unregister(drv_ctx); 3688 | + 3689 | + /* release the config options.*/ 3690 | + config_release(&(drv_ctx->c2c_param)); 3691 | + 3692 | + kfree(drv_ctx); 3693 | + drv_ctx = NULL; 3694 | + 3695 | + DBG("(%s): Exiting\n", __func__); 3696 | + return ret; 3697 | +} 3698 | + 3699 | + 3700 | +/* 3701 | + * for it to be loadable, it must be loaded only after 3702 | + * ntb, ntb_hw_switchtec modules have been loaded successfully. 
3703 | + */ 3704 | +static int __init nvscic2c_init(void) 3705 | +{ 3706 | + int ret = 0; 3707 | + 3708 | + ret = module_probe(); 3709 | + return ret; 3710 | +} 3711 | + 3712 | + 3713 | +#if IS_MODULE(CONFIG_NVSCIC2C) 3714 | +static void __exit nvscic2c_exit(void) 3715 | +{ 3716 | + DBG("(%s): Entering\n", __func__); 3717 | + module_remove(); 3718 | + DBG("(%s): Exiting\n", __func__); 3719 | +} 3720 | + 3721 | +module_init(nvscic2c_init); 3722 | +module_exit(nvscic2c_exit); 3723 | +#else 3724 | +late_initcall(nvscic2c_init); 3725 | +#endif 3726 | diff --git a/drivers/misc/nvscic2c/ntb-client.c b/drivers/misc/nvscic2c/ntb-client.c 3727 | new file mode 100644 3728 | index 0000000..fca7681 3729 | --- /dev/null 3730 | +++ b/drivers/misc/nvscic2c/ntb-client.c 3731 | @@ -0,0 +1,744 @@ 3732 | +/* 3733 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 3734 | + * 3735 | + * This program is free software; you can redistribute it and/or modify it 3736 | + * under the terms and conditions of the GNU General Public License, 3737 | + * version 2, as published by the Free Software Foundation. 3738 | + * 3739 | + * This program is distributed in the hope it will be useful, but WITHOUT 3740 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 3741 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 3742 | + * more details. 3743 | + */ 3744 | + 3745 | +#include "chip-to-chip.h" 3746 | +#include 3747 | +#include 3748 | +#include 3749 | +#include 3750 | +#include 3751 | +#include 3752 | +#include 3753 | +#include 3754 | +#include 3755 | +#include 3756 | +#include 3757 | + 3758 | +/* 3759 | + * Of the capabilities of NTB, we do not use: 3760 | + * - SPAD (self, peer, rd/wr both) 3761 | + * - Peer DB read. 
 3762 | + * - LUT for C2C data (NTB internally may still be using LUT) 3763 | + * - Based on NTB == (k-4.15.0.39-generic || k-4.9-tegra) 3764 | + */ 3765 | + 3766 | + 3767 | +/* over-ride these so that: 3768 | + * - we do not instantiate as a platform device. 3769 | + * - we do not use NTB device for spews. 3770 | + * - judiciously use 80 column limit. 3771 | + * - add abstraction within module. 3772 | + */ 3773 | +#define ERR(...) pr_err(MODULE_NAME": ntb-client:\t" __VA_ARGS__) 3774 | +#define INFO(...) pr_info(MODULE_NAME": ntb-client:\t" __VA_ARGS__) 3775 | +#define DBG(...) pr_debug(MODULE_NAME": ntb-client:\t" __VA_ARGS__) 3776 | + 3777 | + 3778 | +/* Switchtec NTB port from MicroSemi (and useCase_04.pmc) has maximum 3779 | + * 2 memory windows (ID:0 on BAR2 and ID:1 on BAR4). Memory Window:0 will 3780 | + * have first 2MB (32 LUT * 64k) reserved. We can still use Memory Window:1 3781 | + * completely for C2C channel's data and control information. 3782 | + */ 3783 | +#define DATA_MW_ID (1) 3784 | + 3785 | + 3786 | +/* 3787 | + * ntb_client_t 3788 | + * 3789 | + * Internal Private data structure as NTB client. ntb_register_client() 3790 | + * would not accept any driver context. Therefore, we allocate a minimal 3791 | + * NTB client context here which contains backreference to driver context 3792 | + * nvscic2c and pass it around as needed. 3793 | + */ 3794 | +struct ntb_client_t { 3795 | + /* NTB device bound to us as NTB client. Set in NTB client probe. */ 3796 | + struct ntb_dev *ntbdev; 3797 | + 3798 | + /* for async probe completion before we return back 3799 | + * to nvscic2c probe, NTB probe should be completed. 3800 | + */ 3801 | + struct completion probe_cpl; 3802 | + 3803 | + /* size of mw requested by nvscic2c driver(config.c or DT). 3804 | + * proportional to channels supported. 3805 | + */ 3806 | + size_t req_mw_sz; 3807 | + 3808 | + /* allowed doorbells as queried from NTB.
*/ 3809 | + uint32_t db_valid_mask; 3810 | + 3811 | + /* Receive area: PCIe shared mem. Peer's Rd/Wr reflect here. */ 3812 | + struct dma_buff_t self_mem; 3813 | + 3814 | + /* Transmit area: PCIe aperture. Self's Rd/Wr to Peer go via this.*/ 3815 | + struct pci_mmio_t peer_mem; 3816 | + 3817 | + /* fixed physical address to be used for setting up PCIe shared mem. 3818 | + * Parameter given while loading module shall over-ride the same 3819 | + * setting provided in config.c file. 3820 | + */ 3821 | + uint64_t fixed_mw_addr; 3822 | + size_t fixed_mw_sz; 3823 | + bool use_fixed_addr; 3824 | +}; 3825 | + 3826 | + 3827 | +/* 3828 | + * NTB client context. We could have allocated in 3829 | + * ntb client driver probe(), but we need to associate c2c driver context 3830 | + * with ntb client driver. NTB driver doesn't accept any ctx during 3831 | + * registration. Hence we need this global here. 3832 | + */ 3833 | +static struct ntb_client_t *ntb_ctx; 3834 | + 3835 | + 3836 | +/* 3837 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 3838 | + * channel functionality doesn't call directly into NTB apis 3839 | + * but only via ntb-client. 3840 | + */ 3841 | +int ntb_client_db_vector_count(void) 3842 | +{ 3843 | + if ((!ntb_ctx) 3844 | + || (!ntb_ctx->ntbdev)) { 3845 | + ERR("(%s): ntb client not ready\n", __func__); 3846 | + return -EINVAL; 3847 | + } 3848 | + 3849 | + return ntb_db_vector_count(ntb_ctx->ntbdev); 3850 | +} 3851 | + 3852 | + 3853 | +/* 3854 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 3855 | + * channel functionality doesn't call directly into NTB apis 3856 | + * but only via ntb-client. 
3857 | + */ 3858 | +int ntb_client_db_set_mask(uint64_t db_bits) 3859 | +{ 3860 | + if ((!ntb_ctx) 3861 | + || (!ntb_ctx->ntbdev)) { 3862 | + ERR("(%s): ntb client not ready\n", __func__); 3863 | + return -EINVAL; 3864 | + } 3865 | + 3866 | + return ntb_db_set_mask(ntb_ctx->ntbdev, db_bits); 3867 | +} 3868 | + 3869 | + 3870 | +/* 3871 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 3872 | + * channel functionality doesn't call directly into NTB apis 3873 | + * but only via ntb-client. 3874 | + */ 3875 | +int ntb_client_db_clear_mask(uint64_t db_bits) 3876 | +{ 3877 | + if ((!ntb_ctx) 3878 | + || (!ntb_ctx->ntbdev)) { 3879 | + ERR("(%s): ntb client not ready\n", __func__); 3880 | + return -EINVAL; 3881 | + } 3882 | + 3883 | + return ntb_db_clear_mask(ntb_ctx->ntbdev, db_bits); 3884 | +} 3885 | + 3886 | + 3887 | +/* 3888 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 3889 | + * channel functionality doesn't call directly into NTB apis 3890 | + * but only via ntb-client. 3891 | + */ 3892 | +int ntb_client_db_clear(uint64_t db_bits) 3893 | +{ 3894 | + if ((!ntb_ctx) 3895 | + || (!ntb_ctx->ntbdev)) { 3896 | + ERR("(%s): ntb client not ready\n", __func__); 3897 | + return -EINVAL; 3898 | + } 3899 | + 3900 | + return ntb_db_clear(ntb_ctx->ntbdev, db_bits); 3901 | +} 3902 | + 3903 | + 3904 | +/* 3905 | + * Wrapper over corresponding NTB api, added to maintain abstraction: 3906 | + * channel functionality doesn't call directly into NTB apis 3907 | + * but only via ntb-client. 
 3908 | + */ 3909 | +int ntb_client_peer_db_set(uint64_t db_bits) 3910 | +{ 3911 | + if ((!ntb_ctx) 3912 | + || (!ntb_ctx->ntbdev)) { 3913 | + ERR("(%s): ntb client not ready\n", __func__); 3914 | + return -EINVAL; 3915 | + } 3916 | + 3917 | + return ntb_peer_db_set(ntb_ctx->ntbdev, db_bits); 3918 | +} 3919 | + 3920 | + 3921 | +/* 3922 | + * This function allows nvscic2c driver to set the link state as 3923 | + * LINK_UP or LINK_DOWN once the NTB client driver was registered properly. 3924 | + * 3925 | + * Expected use is to set the link to LINK_UP when nvscic2c is done setup of 3926 | + * all c2c channels and is ready to exchange data. 3927 | + * 3928 | + * LINK_UP: link is enabled, any other value disables the link. 3929 | + */ 3930 | +int ntb_client_set_link_status(enum link_status status) 3931 | +{ 3932 | + if ((!ntb_ctx) 3933 | + || (!ntb_ctx->ntbdev)) { 3934 | + ERR("(%s): ntb client not ready\n", __func__); 3935 | + return -EINVAL; 3936 | + } 3937 | + 3938 | + if (status == LINK_UP) { 3939 | + ntb_link_enable(ntb_ctx->ntbdev, 3940 | + NTB_SPEED_AUTO, NTB_WIDTH_AUTO); 3941 | + } else { 3942 | + ntb_link_disable(ntb_ctx->ntbdev); 3943 | + } 3944 | + 3945 | + return 0; 3946 | +} 3947 | + 3948 | + 3949 | +/* 3950 | + * Export nvscic2c(channel-cdev.c) dev node portion of self memory: 3951 | + * Rx memory to userspace via mmap() call. Reason for this implementation 3952 | + * being, if PCIe shared mem was allocated using dma-buff apis, channel 3953 | + * abstraction would not have ntbdev to do mmap for dma buffer. Also, 3954 | + * channel doesn't know if fixed address(iommu=off) is being used. 3955 | + * 3956 | + * If nvscic2c module has called link_mgmt_stop(), it must not 3957 | + * refer to link_status_mem anytime after that.
3958 | + */ 3959 | +int ntb_client_mmap_self_mem(struct vm_area_struct *vma, 3960 | + struct dma_buff_t *self_mem) 3961 | +{ 3962 | + /* args check.*/ 3963 | + if ((!vma) 3964 | + || (!self_mem)) { 3965 | + ERR("(%s): Function args invalid\n", __func__); 3966 | + return -EINVAL; 3967 | + } 3968 | + 3969 | + /* if called without ntb client registered.*/ 3970 | + if ((!ntb_ctx) 3971 | + || (!ntb_ctx->ntbdev)) { 3972 | + ERR("(%s): ntb client not ready\n", __func__); 3973 | + return -EINVAL; 3974 | + } 3975 | + 3976 | + /* ntb-client doesn't know the size of rx memory of channel, 3977 | + * trust channel-cdev.c to have done the mmap size validation. 3978 | + */ 3979 | + if (ntb_ctx->use_fixed_addr) { 3980 | + return remap_pfn_range(vma, vma->vm_start, 3981 | + PFN_DOWN(self_mem->dma_handle), 3982 | + self_mem->size, 3983 | + vma->vm_page_prot); 3984 | + } else { 3985 | + /* size of channel to be mapped is calculated by dma_ api 3986 | + * using vma_pages(): vma->vm_end - vma->vm_start. 3987 | + */ 3988 | + vma->vm_pgoff = ((self_mem->dma_handle 3989 | + - ntb_ctx->self_mem.dma_handle) 3990 | + >> (PAGE_SHIFT) 3991 | + ); 3992 | + return dma_mmap_coherent(&(ntb_ctx->ntbdev->pdev->dev), 3993 | + vma, 3994 | + ntb_ctx->self_mem.pva, 3995 | + ntb_ctx->self_mem.dma_handle, 3996 | + ntb_ctx->self_mem.size); 3997 | + } 3998 | + 3999 | + return -EINVAL; 4000 | +} 4001 | + 4002 | + 4003 | +/* Doorbell event/callback from NT driver. */ 4004 | +static void db_event(void *ctx, int vec) 4005 | +{ 4006 | + struct ntb_client_t *ntb_ctx = NULL; 4007 | + 4008 | + if (!ctx) 4009 | + return; 4010 | + ntb_ctx = (struct ntb_client_t *)(ctx); 4011 | + 4012 | + //DBG("DB Event Handler, DB: (%d)\n", vec); 4013 | + 4014 | + /* pass on the channel abstraction to handle event. 4015 | + * this is not clean, as we didn't register this CB 4016 | + * with ntb-client.c. This is just calling into assuming 4017 | + * channels must be up when first DB vec comes. 
 4018 | + */ 4019 | + channel_db_event(vec); 4020 | +} 4021 | + 4022 | + 4023 | +/* 4024 | + * NTB device driver would raise this callback whenever there is change 4025 | + * in link status(up->down, down->up). 4026 | + * 4027 | + * We expect this to come only when NTB client driver has registered with 4028 | + * NTB driver, therefore skipping all validations. 4029 | + * 4030 | + * This information would be vital to nvscic2c NTB LINK MANAGEMENT THREAD. 4031 | + */ 4032 | +static void link_event(void *ctx) 4033 | +{ 4034 | + struct ntb_client_t *ntb_ctx = NULL; 4035 | + enum ntb_speed speed = NTB_SPEED_AUTO; 4036 | + enum ntb_width width = NTB_WIDTH_AUTO; 4037 | + int up = 0; 4038 | + 4039 | + if (ctx == NULL) 4040 | + return; 4041 | + ntb_ctx = (struct ntb_client_t *)(ctx); 4042 | + 4043 | + up = ntb_link_is_up(ntb_ctx->ntbdev, &speed, &width); 4044 | + 4045 | + //DBG("link is %s speed %d width %d\n", 4046 | + // (up) ? ("up") : ("down"), speed, width); 4047 | + 4048 | + if (up) 4049 | + link_mgmt_event_cb(LINK_UP); 4050 | + else 4051 | + link_mgmt_event_cb(LINK_DOWN); 4052 | +} 4053 | + 4054 | + 4055 | +/* event callbacks registered with NTB driver. 4056 | + * for every Doorbell triggered. 4057 | + * for NTB link going up/down. 4058 | + */ 4059 | +static const struct ntb_ctx_ops client_ops = { 4060 | + .link_event = link_event, 4061 | + .db_event = db_event, 4062 | +}; 4063 | + 4064 | + 4065 | +/* 4066 | + * Private to NTB client driver. 4067 | + * 4068 | + * This function is probe for NTB client driver. 4069 | + * Invoked by NTB device driver during NTB client registration. 4070 | + * Sequence: ntb_client_register()->ntb_register_client()-> 4071 | + * ntb_client_probe(). 4072 | + * 4073 | + * Here we go over the memory window of our interest (mw:1) and 4074 | + * allocate a physical backing for mw:1, we also map the PCIe 4075 | + * aperture for the mw:1 for tx purpose. 4076 | + * 4077 | + * Assumption is BOTH NT-EP ports have same BAR4/5 size settings.
 4078 | + */ 4079 | +static int ntb_client_probe(struct ntb_client *self, 4080 | + struct ntb_dev *ntb) 4081 | +{ 4082 | + int ret = 0, mw_count = 0; 4083 | + resource_size_t win_size = 0x0; 4084 | + phys_addr_t base = 0x0; 4085 | + int db_vecs = 0, db_read_mask = 0x0; 4086 | + 4087 | + if (!ntb_ctx) { 4088 | + ret = -EINVAL; 4089 | + ERR("Invalid ntb client ctx\n"); 4090 | + goto err; 4091 | + } 4092 | + ntb_ctx->ntbdev = ntb; 4093 | + 4094 | + /* query the available mw regions with NT port. */ 4095 | +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 14, 0) 4096 | + mw_count = ntb_mw_count(ntb_ctx->ntbdev); 4097 | +#else 4098 | + mw_count = ntb_peer_mw_count(ntb_ctx->ntbdev); 4099 | +#endif 4100 | + if (mw_count < (DATA_MW_ID + 1)) { 4101 | + ret = -ENOMEM; 4102 | + ERR("Required pcie memory window not found\n"); 4103 | + goto err_mw_count; 4104 | + } 4105 | + 4106 | + /* use WindowID:1 for sharing PCIe shared memory.*/ 4107 | + /* MW:1 can be extended to BAR size of our choice by extending the 4108 | + * max_limit in NTB driver. Even after doing so, NTB doesn't allow 4109 | + * setting MW:0 size to be greater than 2MB. So, we use MW:1 for 4110 | + * data xfers. 4111 | + */ 4112 | +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 14, 0) 4113 | + ret = ntb_mw_get_range(ntb_ctx->ntbdev, DATA_MW_ID, 4114 | + &(base), &(win_size), NULL, NULL); 4115 | +#else 4116 | + ret = ntb_peer_mw_get_addr(ntb_ctx->ntbdev, DATA_MW_ID, 4117 | + &(base), &(win_size)); 4118 | +#endif 4119 | + if (ret) { 4120 | + ret = -ENOMEM; 4121 | + ERR("Failed to query memory window, mw:%u\n", DATA_MW_ID); 4122 | + goto err_mw_get_range; 4123 | + } 4124 | + 4125 | + /* the PCIe shared mem (or aperture) is less than requested. */ 4126 | + if (win_size < ntb_ctx->req_mw_sz) { 4127 | + ret = -ENOMEM; 4128 | + ERR("NTB PCIe memory window size less than required.\n"); 4129 | + goto err_mw_size_check; 4130 | + } 4131 | + 4132 | + /* if we have been asked to use fixed physical address.
- x86_64 4133 | + * assuming we bypass swiotlb and intel iommu. 4134 | + * on x86_64 for 64MB space, we get message SWIOTLB out of space. 4135 | + * 4136 | + * Ideally, we would like each nvscic2c channel to request and 4137 | + * map their own view of this memory. 4138 | + */ 4139 | + if (ntb_ctx->use_fixed_addr) { 4140 | + /* stake claim on this memory.*/ 4141 | + if (!(request_mem_region(ntb_ctx->fixed_mw_addr, 4142 | + ntb_ctx->req_mw_sz, 4143 | + MODULE_NAME))) { 4144 | + ret = -EBUSY; 4145 | + ERR("Self Mem:(%pa[d])+(0x%08zx) already in use\n", 4146 | + &(ntb_ctx->self_mem.dma_handle), 4147 | + ntb_ctx->self_mem.size); 4148 | + goto err_self_busy; 4149 | + } 4150 | + ntb_ctx->self_mem.size = ntb_ctx->req_mw_sz; 4151 | + ntb_ctx->self_mem.dma_handle = ntb_ctx->fixed_mw_addr; 4152 | + } else { 4153 | + /* use ntb(pcie) dev to allocate memory.*/ 4154 | + ntb_ctx->self_mem.size = ntb_ctx->req_mw_sz; 4155 | + ntb_ctx->self_mem.pva = dma_zalloc_coherent( 4156 | + &(ntb_ctx->ntbdev->pdev->dev), 4157 | + ntb_ctx->self_mem.size, 4158 | + &(ntb_ctx->self_mem.dma_handle), 4159 | + GFP_KERNEL); 4160 | + if (!ntb_ctx->self_mem.pva) { 4161 | + ret = -ENOMEM; 4162 | + ERR("Window:(%u) of sz:(%08zx) alloc failed\n", 4163 | + DATA_MW_ID, ntb_ctx->self_mem.size); 4164 | + goto err_alloc; 4165 | + } 4166 | + } 4167 | + 4168 | + /* the dma_handle must be aligned to size. NTB requirement.*/ 4169 | + if (!IS_ALIGNED(ntb_ctx->self_mem.dma_handle, ntb_ctx->self_mem.size)) { 4170 | + ret = -ENOMEM; 4171 | + ERR("iova:(%pa[d]) for mw not aligned to size:(0x%zx)\n", 4172 | + &(ntb_ctx->self_mem.dma_handle), 4173 | + ntb_ctx->self_mem.size); 4174 | + goto err_align; 4175 | + } 4176 | + 4177 | + /* set up the actual translations so that peer's rd/wr fall here. 
*/ 4178 | +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 14, 0) 4179 | + ret = ntb_mw_set_trans(ntb_ctx->ntbdev, DATA_MW_ID, 4180 | + ntb_ctx->self_mem.dma_handle, 4181 | + ntb_ctx->self_mem.size); 4182 | +#else 4183 | + ret = ntb_mw_set_trans(ntb_ctx->ntbdev, NTB_DEF_PEER_IDX, DATA_MW_ID, 4184 | + ntb_ctx->self_mem.dma_handle, 4185 | + ntb_ctx->self_mem.size); 4186 | +#endif 4187 | + if (ret != 0) { 4188 | + ERR("Failed to map translation for mw:(%u)\n", 4189 | + DATA_MW_ID); 4190 | + goto err_set_trans; 4191 | + } 4192 | + 4193 | + 4194 | + /* Map peer memory for Tx xfers. LKM doesn't need this 4195 | + * as per current design, all writes - Channel state mgmt, 4196 | + * data xfers go from user-space SW. Nevertheless, we still do it. 4197 | + */ 4198 | + ntb_ctx->peer_mem.size = ntb_ctx->req_mw_sz; 4199 | + ntb_ctx->peer_mem.aper = base; 4200 | + 4201 | + /* stake claim on peer memory - Only that much as we require.*/ 4202 | + if (!(request_mem_region(ntb_ctx->peer_mem.aper, 4203 | + ntb_ctx->peer_mem.size, 4204 | + MODULE_NAME))) { 4205 | + ret = -EBUSY; 4206 | + ERR("Peer Mem:(%pa[p])+(0x%08zx) already in use\n", 4207 | + &(ntb_ctx->peer_mem.aper), ntb_ctx->peer_mem.size); 4208 | + goto err_peer_busy; 4209 | + } 4210 | + 4211 | + /* While starting, we disable all DBs. We will enable them as and when 4212 | + * channel requires. 4213 | + */ 4214 | + ntb_ctx->db_valid_mask = ntb_db_valid_mask(ntb_ctx->ntbdev); 4215 | + ret = ntb_db_set_mask(ntb_ctx->ntbdev, ntb_ctx->db_valid_mask); 4216 | + if (ret) { 4217 | + ERR("Failed to mask all DB(s) events using:(0x%08X)\n", 4218 | + ntb_ctx->db_valid_mask); 4219 | + goto err_db; 4220 | + } 4221 | + 4222 | + /* check if the NTB device supports as many DB vectors as 4223 | + * supported DB's - Our nvscic2c design is based on 1 vector 4224 | + * per DB. 
This check is to be done once we set our interest of DBs 4225 | + * with NTB module. DB vecs:[0, (supported - 1)] 4226 | + */ 4227 | + db_vecs = ntb_db_vector_count(ntb_ctx->ntbdev); 4228 | + db_read_mask = ntb_db_read_mask(ntb_ctx->ntbdev); 4229 | + if (db_vecs < fls(db_read_mask)) { 4230 | + ERR("DB vectors:(%d) < DBs in:(0x%08X), need 1 MSI vector per DB\n", 4231 | + db_vecs, db_read_mask); 4232 | + ret = -EINVAL; 4233 | + goto err_db_vec; 4234 | + } 4235 | + 4236 | + /* register local ntb_client context with NTB driver. */ 4237 | + ret = ntb_set_ctx(ntb_ctx->ntbdev, ntb_ctx, &(client_ops)); 4238 | + if (ret) 4239 | + goto err_set_ctx; 4240 | + 4241 | + /* signal probe completed successfully. */ 4242 | + complete(&(ntb_ctx->probe_cpl)); 4243 | + 4244 | + /* NTB link will be enabled by nvscic2c driver.*/ 4245 | + return ret; 4246 | + 4247 | +err_set_ctx: 4248 | +err_db_vec: 4249 | + ntb_db_set_mask(ntb_ctx->ntbdev, ntb_ctx->db_valid_mask); 4250 | + 4251 | +err_db: 4252 | + release_mem_region(ntb_ctx->peer_mem.aper, 4253 | + ntb_ctx->peer_mem.size); 4254 | + 4255 | +err_peer_busy: 4256 | +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 14, 0) 4257 | + ntb_mw_clear_trans(ntb_ctx->ntbdev, DATA_MW_ID); 4258 | +#else 4259 | + ntb_mw_set_trans(ntb_ctx->ntbdev, NTB_DEF_PEER_IDX, 4260 | + DATA_MW_ID, 0x0, 0x0); 4261 | +#endif 4262 | + 4263 | +err_set_trans: 4264 | +err_align: 4265 | + if ((ntb_ctx->use_fixed_addr) 4266 | + && (ntb_ctx->self_mem.dma_handle != 0)) { 4267 | + release_mem_region(ntb_ctx->self_mem.dma_handle, 4268 | + ntb_ctx->self_mem.size); 4269 | + ntb_ctx->self_mem.dma_handle = 0; 4270 | + } else if (ntb_ctx->self_mem.pva) { 4271 | + dma_free_coherent(&(ntb_ctx->ntbdev->pdev->dev), 4272 | + ntb_ctx->self_mem.size, 4273 | + ntb_ctx->self_mem.pva, 4274 | + ntb_ctx->self_mem.dma_handle); 4275 | + ntb_ctx->self_mem.pva = NULL; 4276 | + } 4277 | + 4278 | +err_alloc: 4279 | +err_self_busy: 4280 | +err_mw_size_check: 4281 | +err_mw_get_range: 4282 |
+err_mw_count: 4283 | +err: 4284 | + if (ntb_ctx) 4285 | + ntb_ctx->ntbdev = NULL; 4286 | + return ret; 4287 | +} 4288 | + 4289 | + 4290 | +/* 4291 | + * Private to NTB client driver. 4292 | + * 4293 | + * This function does the reverse of probe() for NTB client driver. 4294 | + * Sequence: ntb_client_unregister()->ntb_unregister_client()-> 4295 | + * ntb_client_remove(). 4296 | + */ 4297 | +static void ntb_client_remove(struct ntb_client *self, 4298 | + struct ntb_dev *ntb) 4299 | +{ 4300 | + if ((!ntb) 4301 | + || (ntb->ctx != ntb_ctx) 4302 | + || (!ntb_ctx->ntbdev)) { 4303 | + return; 4304 | + } 4305 | + 4306 | + /* clear the NTB client setup.*/ 4307 | + ntb_db_set_mask(ntb_ctx->ntbdev, ntb_ctx->db_valid_mask); 4308 | +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 14, 0) 4309 | + ntb_mw_clear_trans(ntb_ctx->ntbdev, DATA_MW_ID); 4310 | +#else 4311 | + ntb_mw_set_trans(ntb_ctx->ntbdev, NTB_DEF_PEER_IDX, 4312 | + DATA_MW_ID, 0x0, 0x0); 4313 | +#endif 4314 | + ntb_clear_ctx(ntb_ctx->ntbdev); 4315 | + 4316 | + if (ntb_ctx->peer_mem.aper != 0) { 4317 | + release_mem_region(ntb_ctx->peer_mem.aper, 4318 | + ntb_ctx->peer_mem.size); 4319 | + ntb_ctx->peer_mem.aper = 0; 4320 | + } 4321 | + 4322 | + if ((ntb_ctx->use_fixed_addr) 4323 | + && (ntb_ctx->self_mem.dma_handle != 0)) { 4324 | + release_mem_region(ntb_ctx->self_mem.dma_handle, 4325 | + ntb_ctx->self_mem.size); 4326 | + ntb_ctx->self_mem.dma_handle = 0; 4327 | + } else if (ntb_ctx->self_mem.pva) { 4328 | + dma_free_coherent(&(ntb_ctx->ntbdev->pdev->dev), 4329 | + ntb_ctx->self_mem.size, 4330 | + ntb_ctx->self_mem.pva, 4331 | + ntb_ctx->self_mem.dma_handle); 4332 | + ntb_ctx->self_mem.pva = NULL; 4333 | + } 4334 | + 4335 | + ntb_ctx->ntbdev = NULL; 4336 | +} 4337 | + 4338 | + 4339 | +/* for ntb client driver registration.
*/ 4340 | +static struct ntb_client c2c_ntb_client = { 4341 | + .ops = { 4342 | + .probe = ntb_client_probe, 4343 | + .remove = ntb_client_remove, 4344 | + }, 4345 | +}; 4346 | + 4347 | + 4348 | +/* 4349 | + * Interface for nvscic2c driver to register itself as a NTB client driver. 4350 | + * 4351 | + * Because we expect nvscic2c window size requirements to be fulfilled 4352 | + * by PCIe NT shared memory, this function should be called after successful 4353 | + * parsing of DT node of nvscic2c. 4354 | + * 4355 | + * Not thread-safe. 4356 | + */ 4357 | +int ntb_client_register(struct c2c_drv_ctx_t *drv_ctx) 4358 | +{ 4359 | + int ret = 0; 4360 | + 4361 | + /* validation. */ 4362 | + if ((!drv_ctx) 4363 | + || (!drv_ctx->c2c_param.req_mw_sz)) { 4364 | + ret = -EINVAL; 4365 | + ERR("(%s): Invalid Params\n", __func__); 4366 | + goto err; 4367 | + } 4368 | + 4369 | + /* there should not be an existing ntb client driver context already. */ 4370 | + if (ntb_ctx) { 4371 | + ret = -EINVAL; 4372 | + ERR("NTB client instantiated already\n"); 4373 | + goto err; 4374 | + } 4375 | + 4376 | + /* ntb client driver doesn't allow any context to be passed 4377 | + * to it during registration, it expects ntb client to allocate 4378 | + * context during its own probe. Hence we allocate global ntb 4379 | + * context here and tag nvscic2c device to it. 4380 | + */ 4381 | + ntb_ctx = kzalloc(sizeof(struct ntb_client_t), GFP_KERNEL); 4382 | + if (!ntb_ctx) { 4383 | + ret = -ENOMEM; 4384 | + ERR("Failed allocating ntb ctx.\n"); 4385 | + goto err; 4386 | + } 4387 | + ntb_ctx->req_mw_sz = drv_ctx->c2c_param.req_mw_sz; 4388 | + ntb_ctx->fixed_mw_addr = drv_ctx->c2c_param.fixed_mw_addr; 4389 | + ntb_ctx->fixed_mw_sz = drv_ctx->c2c_param.fixed_mw_sz; 4390 | + ntb_ctx->use_fixed_addr = drv_ctx->c2c_param.use_fixed_addr; 4391 | + 4392 | + /* we wait on this for probe to complete. Initialise here.*/ 4393 | + init_completion(&(ntb_ctx->probe_cpl)); 4394 | + 4395 | + /* register for ntb client driver now.
*/ 4396 | + ret = ntb_register_client(&(c2c_ntb_client)); 4397 | + if (ret) { 4398 | + ERR("Failed to register as NTB client\n"); 4399 | + goto err; 4400 | + } 4401 | + 4402 | + /* wait for PCIe NTB client async probe to complete because 4403 | + * nvscic2c driver will proceed ahead assuming probe() finished 4404 | + * if still not done. 2sec - sufficiently large. 4405 | + */ 4406 | + if (wait_for_completion_interruptible_timeout(&(ntb_ctx->probe_cpl), 4407 | + msecs_to_jiffies(2000)) <= 0) { 4408 | + ret = -ETIMEDOUT; 4409 | + ERR("NTB client probe took long time, failed probably\n"); 4410 | + goto err; 4411 | + } 4412 | + 4413 | + /* all okay. */ 4414 | + return ret; 4415 | + 4416 | +err: 4417 | + ntb_client_unregister(drv_ctx); 4418 | + return ret; 4419 | +} 4420 | + 4421 | + 4422 | +/* 4423 | + * Interface for nvscic2c driver to unload the NTB client driver. 4424 | + * 4425 | + * As a result the NTB link between two SoC's would go DOWN. 4426 | + * 4427 | + * Not thread-safe. 4428 | + */ 4429 | +void ntb_client_unregister(struct c2c_drv_ctx_t *drv_ctx) 4430 | +{ 4431 | + /* validation. */ 4432 | + if ((!drv_ctx) 4433 | + || (!ntb_ctx)) { 4434 | + return; 4435 | + } 4436 | + 4437 | + ntb_unregister_client(&(c2c_ntb_client)); 4438 | + 4439 | + kfree(ntb_ctx); 4440 | + ntb_ctx = NULL; 4441 | +} 4442 | + 4443 | + 4444 | +/* 4445 | + * NTB client driver has a private context, query the pcie shared memory 4446 | + * with the caller 4447 | + * 4448 | + * Can be called once NTB client is registered with NTB properly. 
4449 | + */ 4450 | +int ntb_client_query_mem_info(struct dma_buff_t *self_mem, 4451 | + struct pci_mmio_t *peer_mem) 4452 | +{ 4453 | + int ret = 0; 4454 | + 4455 | + if ((!ntb_ctx) 4456 | + || (!ntb_ctx->ntbdev)) { 4457 | + ret = -EINVAL; 4458 | + pr_err("(%s): ntb client not ready\n", __func__); 4459 | + goto err; 4460 | + } 4461 | + 4462 | + if ((!self_mem) 4463 | + || (!peer_mem)) { 4464 | + ret = -EINVAL; 4465 | + pr_info("(%s): Invalid Params\n", __func__); 4466 | + goto err; 4467 | + } 4468 | + 4469 | + memcpy(self_mem, &(ntb_ctx->self_mem), sizeof(ntb_ctx->self_mem)); 4470 | + memcpy(peer_mem, &(ntb_ctx->peer_mem), sizeof(ntb_ctx->peer_mem)); 4471 | + 4472 | + return ret; 4473 | +err: 4474 | + return ret; 4475 | +} 4476 | diff --git a/include/linux/nvscic2c-ioctl.h b/include/linux/nvscic2c-ioctl.h 4477 | new file mode 100644 4478 | index 0000000..0d7bcdd 4479 | --- /dev/null 4480 | +++ b/include/linux/nvscic2c-ioctl.h 4481 | @@ -0,0 +1,152 @@ 4482 | +/* 4483 | + * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. 4484 | + * 4485 | + * This program is free software; you can redistribute it and/or modify it 4486 | + * under the terms and conditions of the GNU General Public License, 4487 | + * version 2, as published by the Free Software Foundation. 4488 | + * 4489 | + * This program is distributed in the hope it will be useful, but WITHOUT 4490 | + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 4491 | + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 4492 | + * more details. 4493 | + */ 4494 | + 4495 | +/* 4496 | + * Interface for user-space sw to issue ioctl commands on the 4497 | + * nvscic2c channel devices. 
4498 | + */ 4499 | +#ifndef __NVSCIC2C_IOCTL_H__ 4500 | +#define __NVSCIC2C_IOCTL_H__ 4501 | + 4502 | +#include 4503 | +#include 4504 | + 4505 | +#if !defined(__KERNEL__) 4506 | +#define __user 4507 | +#endif 4508 | + 4509 | + 4510 | +/* IOCTL magic number - seen available in ioctl-number.txt*/ 4511 | +#define NVSCIC2C_IOCTL_MAGIC 0xC2 4512 | + 4513 | + 4514 | +/* 4515 | + * User-space will not see the internal EVENT IDs. 4516 | + * It will be masked behind these types. Data passed on by user-space in 4517 | + * response to NOTIFY_REMOTE. 4518 | + */ 4519 | + 4520 | +/* send ioctl to trigger remote data channel.*/ 4521 | +#define NVSCIC2C_NOTIFY_PRODUCER (1 << 0) 4522 | + 4523 | +/* send ioctl to trigger remote data channel.*/ 4524 | +#define NVSCIC2C_NOTIFY_CONSUMER (1 << 1) 4525 | + 4526 | +/* send ioctl to trigger remote state mgmt.*/ 4527 | +#define NVSCIC2C_NOTIFY_STATE (1 << 2) 4528 | + 4529 | + 4530 | +/* Channel Device link status.*/ 4531 | +enum link_status { 4532 | + LINK_DOWN = 0, 4533 | + LINK_UP, 4534 | +}; 4535 | + 4536 | + 4537 | +/* 4538 | + * bulk data transfer channels could be uni-directional. If there is no 4539 | + * use-case for bi-directional data xfer but we still create a full-duplex 4540 | + * single nvscic2c bulk channel, we end up leaving lot of PCIe shared memory 4541 | + * being un-utilised. 4542 | + */ 4543 | +enum bulk_xfer_type { 4544 | + /* it's not a bulk data xfer device but plain CPU channel. 4545 | + * data dir: Self<->Peer. (default, do not change this value). 4546 | + */ 4547 | + BULK_XFER_TYPE_NONE = 0, 4548 | + 4549 | + /* This device supports only bulk transfer, 4550 | + * data dir: Peer->Self. We do write over PCIe for data. 4551 | + */ 4552 | + BULK_XFER_TYPE_PRODUCER, 4553 | + 4554 | + /* This device supports only bulk transfer, 4555 | + * data dir: Self->Peer. We do write over PCIe for data. 
4556 | + */ 4557 | + BULK_XFER_TYPE_CONSUMER, 4558 | + 4559 | + /* This device supports only bulk transfer, 4560 | + * data dir: Peer->Self but we use Self capability to read 4561 | + * data over PCIe(typically DMA). 4562 | + */ 4563 | + BULK_XFER_TYPE_PRODUCER_PCIE_READ, 4564 | + 4565 | + /* This device supports only bulk transfer, 4566 | + * data dir: Self->Peer but we use peer capability to read 4567 | + * data over PCIe(typically DMA). 4568 | + */ 4569 | + BULK_XFER_TYPE_CONSUMER_PCIE_READ, 4570 | + 4571 | + /* Invalid.*/ 4572 | + BULK_XFER_TYPE_INVALID 4573 | +}; 4574 | + 4575 | + 4576 | +/* 4577 | + * memory segment available to be mapped by user. This will 4578 | + * typically be returned from the device to user. 4579 | + */ 4580 | +struct mem_map { 4581 | + /* would be one of the enum nvscic2c_mem_type.*/ 4582 | + uint32_t offset; 4583 | + 4584 | + /* size of this memory type device would like user-space to map.*/ 4585 | + uint32_t size; 4586 | +}; 4587 | + 4588 | + 4589 | +/* Data passed on user-space in response to GET_INFO.*/ 4590 | +struct nvscic2c_info { 4591 | + /* device peer and self memory fragmented in these many frames.*/ 4592 | + int8_t nframes; 4593 | + 4594 | + /* device peer and self memory fragmented frames of these size.*/ 4595 | + uint32_t frame_sz; 4596 | + 4597 | + /* This device supports only CPU transfer(small data) or Bulk data 4598 | + * xfer. If Bulk data, then is data going outside this device 4599 | + * or coming into this device. 4600 | + */ 4601 | + enum bulk_xfer_type xfer_type; 4602 | + 4603 | + /* peer memory info.*/ 4604 | + struct mem_map peer; 4605 | + 4606 | + /* self memory info. 
*/ 4607 | + struct mem_map self; 4608 | + 4609 | + /* control memory info.*/ 4610 | + struct mem_map ctrl; 4611 | + 4612 | + /* NTB link memory info.*/ 4613 | + struct mem_map link; 4614 | + 4615 | + /* platform: x86_64 or tegra - DEBUG ONLY.*/ 4616 | + char platform[24]; 4617 | +}; 4618 | + 4619 | +/* IOCTL definitions */ 4620 | + 4621 | +/* query nvscic2c device properties for mapping its memory in user-space.*/ 4622 | +#define NVSCIC2C_IOCTL_GET_INFO \ 4623 | + _IOR(NVSCIC2C_IOCTL_MAGIC, 1, struct nvscic2c_info) 4624 | + 4625 | +/* notify remote(data or state) 4626 | + * parameter must be one or a combination of NVSCIC2C_NOTIFY_*: 4627 | + */ 4628 | +#define NVSCIC2C_IOCTL_NOTIFY_REMOTE \ 4629 | + _IOW(NVSCIC2C_IOCTL_MAGIC, 2, uint8_t) 4630 | + 4631 | +#define NVSCIC2C_IOCTL_NUMBER_MAX 2 4632 | + 4633 | +#endif //__NVSCIC2C_IOCTL_H__ 4634 | --------------------------------------------------------------------------------
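The header above says the NOTIFY_REMOTE parameter "must be one or a combination of NVSCIC2C_NOTIFY_*". A minimal user-space sketch of that rule follows. The macro values are mirrored from nvscic2c-ioctl.h so the snippet builds on its own; `notify_mask_is_valid()` and `notify_remote()` are hypothetical helper names, not part of the driver, and passing the mask by pointer is an assumption based on the `_IOW(..., uint8_t)` encoding (the driver source is authoritative on this).

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Mirrored from nvscic2c-ioctl.h so this sketch is stand-alone. */
#define NVSCIC2C_IOCTL_MAGIC      0xC2
#define NVSCIC2C_NOTIFY_PRODUCER  (1 << 0)
#define NVSCIC2C_NOTIFY_CONSUMER  (1 << 1)
#define NVSCIC2C_NOTIFY_STATE     (1 << 2)
#define NVSCIC2C_IOCTL_NOTIFY_REMOTE \
	_IOW(NVSCIC2C_IOCTL_MAGIC, 2, uint8_t)

/* Hypothetical helper: a mask is acceptable when it is non-zero and
 * carries only the three NVSCIC2C_NOTIFY_* bits defined by the header.
 */
static int notify_mask_is_valid(uint8_t mask)
{
	const uint8_t known = NVSCIC2C_NOTIFY_PRODUCER |
			      NVSCIC2C_NOTIFY_CONSUMER |
			      NVSCIC2C_NOTIFY_STATE;

	return (mask != 0) && ((mask & ~known) == 0);
}

/* Hypothetical helper: issue NOTIFY_REMOTE on an already-open channel
 * device fd. Assumption: the driver reads the mask through a user
 * pointer. Returns -1 on a bad mask or when ioctl() fails.
 */
static int notify_remote(int fd, uint8_t mask)
{
	if (!notify_mask_is_valid(mask))
		return -1;

	return ioctl(fd, NVSCIC2C_IOCTL_NOTIFY_REMOTE, &mask);
}
```

A caller would open the channel device node (the path depends on the platform configuration and is not shown here) and call, for example, `notify_remote(fd, NVSCIC2C_NOTIFY_STATE)`. Checking the mask in user space only avoids a round trip into the kernel; it does not replace validation in the driver.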