SYSFS FILES

For each InfiniBand device, the InfiniBand drivers create the
following files under /sys/class/infiniband/<device name>:

  node_type      - Node type (CA, switch or router)
  node_guid      - Node GUID
  sys_image_guid - System image GUID

In addition, there is a "ports" subdirectory, with one subdirectory
for each port.  For example, if mthca0 is a 2-port HCA, there will
be two directories:

  /sys/class/infiniband/mthca0/ports/1
  /sys/class/infiniband/mthca0/ports/2

(A switch will only have a single "0" subdirectory for switch port
0; no subdirectory is created for normal switch ports.)

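The layout above can be walked with a short shell loop. This is only a sketch: the function name list_ib_ports and its optional directory argument are illustrative and not part of any driver or tool. The argument exists so the loop can be pointed at a copy of the tree.

```shell
# Print one line per port: "<device> port <number> (<node_type contents>)".
# $1: class directory to scan; defaults to /sys/class/infiniband.
# list_ib_ports is a hypothetical helper name used only for this sketch.
list_ib_ports() {
    base=${1:-/sys/class/infiniband}
    for dev in "$base"/*; do
        [ -d "$dev" ] || continue        # glob matched nothing, or not a directory
        name=${dev##*/}
        node_type=$(cat "$dev/node_type" 2>/dev/null)
        for port in "$dev"/ports/*; do
            [ -d "$port" ] || continue
            printf '%s port %s (%s)\n' "$name" "${port##*/}" "$node_type"
        done
    done
}
```

Calling list_ib_ports with no argument scans the live sysfs tree and prints nothing on a machine without InfiniBand devices.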
In each port subdirectory, the following files are created:

  cap_mask       - Port capability mask
  lid            - Port LID
  lid_mask_count - Port LID mask count
  rate           - Port data rate (active width * active speed)
  sm_lid         - Subnet manager LID for port's subnet
  sm_sl          - Subnet manager SL for port's subnet
  state          - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
  phys_state     - Port physical state (Sleep, Polling, LinkUp, etc.)

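As a rough illustration, a few of the per-port files listed above can be read in one pass. The helper name show_ib_port is hypothetical, and the selection of files is arbitrary; files that are missing or unreadable are skipped.

```shell
# Print "<file>=<contents>" for a handful of per-port attribute files.
# $1: a port directory, e.g. /sys/class/infiniband/mthca0/ports/1
# show_ib_port is a hypothetical helper name used only for this sketch.
show_ib_port() {
    port_dir=$1
    for f in state phys_state rate lid sm_lid; do
        if [ -r "$port_dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$port_dir/$f")"
        fi
    done
}
```

For example: show_ib_port /sys/class/infiniband/mthca0/ports/1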
There is also a "counters" subdirectory, with files

  VL15_dropped
  excessive_buffer_overrun_errors
  link_downed
  link_error_recovery
  local_link_integrity_errors
  port_rcv_constraint_errors
  port_rcv_data
  port_rcv_errors
  port_rcv_packets
  port_rcv_remote_physical_errors
  port_rcv_switch_relay_errors
  port_xmit_constraint_errors
  port_xmit_data
  port_xmit_discards
  port_xmit_packets
  symbol_error

Each of these files contains the corresponding value from the port's
Performance Management PortCounters attribute, as described in
section 16.1.3.5 of the InfiniBand Architecture Specification.

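When checking a link, it is often enough to see which counters have moved. The following sketch prints only the counters whose value is not 0. The helper name nonzero_ib_counters is hypothetical, and the sketch assumes each counter file holds a single decimal value.

```shell
# Print "<counter> <value>" for every counter file whose contents are not "0".
# $1: a port directory containing a "counters" subdirectory.
# nonzero_ib_counters is a hypothetical helper name used only for this sketch.
nonzero_ib_counters() {
    counters_dir=$1/counters
    for f in "$counters_dir"/*; do
        [ -f "$f" ] || continue          # no counters directory or empty glob
        value=$(cat "$f")
        [ "$value" = "0" ] && continue   # skip counters that have not moved
        printf '%s %s\n' "${f##*/}" "$value"
    done
}
```

For example: nonzero_ib_counters /sys/class/infiniband/mthca0/ports/1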
The "pkeys" and "gids" subdirectories contain one file for each
entry in the port's P_Key or GID table respectively.  For example,
ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
table.

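Because the files are named after table indices, a plain glob returns them in lexical order (0, 1, 10, 2, ...). The sketch below lists a port's P_Key table in numeric index order instead. The helper name list_ib_pkeys is hypothetical.

```shell
# Print "<index> <value>" for each P_Key table entry, sorted by index.
# $1: a port directory containing a "pkeys" subdirectory.
# list_ib_pkeys is a hypothetical helper name used only for this sketch.
list_ib_pkeys() {
    pkeys_dir=$1/pkeys
    for f in "$pkeys_dir"/*; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "${f##*/}" "$(cat "$f")"
    done | sort -n                       # numeric sort on the leading index
}
```

The same loop works for the "gids" subdirectory if the directory name is changed.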
There is an optional "hw_counters" subdirectory that may be under either
the parent device or the port subdirectories or both.  If present,
it contains a list of counters provided by the hardware.  They may match
some of the counters in the counters directory, but they often include
many other counters.  In addition to the various counters, there will
be a file named "lifespan" that configures how frequently the core
should update the counters when they are being accessed (counters are
not updated if they are not being accessed).  The lifespan is in
milliseconds and defaults to 10 unless set to something else by the
driver.  Users may echo a value between 0 and 10000 to the lifespan file
to set the length of time between updates in milliseconds.

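A small wrapper can reject out-of-range values before they reach the lifespan file. This is a sketch only: the function name set_hw_counters_lifespan is hypothetical, and the range check simply mirrors the 0 to 10000 limit stated above.

```shell
# Write a lifespan (in milliseconds) to a hw_counters "lifespan" file after
# checking that it is an integer between 0 and 10000.
# $1: path to the lifespan file, $2: new value in milliseconds.
# set_hw_counters_lifespan is a hypothetical helper name used for this sketch.
set_hw_counters_lifespan() {
    lifespan_file=$1
    ms=$2
    case $ms in
        ''|*[!0-9]*)
            echo "lifespan must be a non-negative integer" >&2
            return 1 ;;
    esac
    if [ "$ms" -gt 10000 ]; then
        echo "lifespan must be between 0 and 10000 ms" >&2
        return 1
    fi
    echo "$ms" > "$lifespan_file"
}
```

For example: set_hw_counters_lifespan /sys/class/infiniband/<device name>/hw_counters/lifespan 100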
MTHCA

The Mellanox HCA driver also creates the files:

  hw_rev   - Hardware revision number
  fw_ver   - Firmware version
  hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
             or "MT25208"

HFI1

The hfi1 driver also creates these additional files:

  hw_rev       - hardware revision
  board_id     - manufacturing board id
  tempsense    - thermal sense information
  serial       - board serial number
  nfreectxts   - number of free user contexts
  nctxts       - number of allowed contexts (PSM2)
  chip_reset   - diagnostic (root only)
  boardversion - board version

  sdma<N>/         - one directory per sdma engine (0 - 15)
  sdma<N>/cpu_list - read-write, list of cpus for user-process to sdma
                     engine assignment.
  sdma<N>/vl       - read-only, vl the sdma engine maps to.

This interface gives the user control over the affinity settings
for the hfi1 device.
As an example, to set the irq affinity of an sdma engine, and the thread
affinity of the user processes that use that engine, to cpus that are
"near" in terms of NUMA configuration or physical cpu location, the
user will do:

  echo "3" > /proc/irq/<N>/smp_affinity_list
  echo "4-7" > /sys/devices/.../sdma3/cpu_list
  cat /sys/devices/.../sdma3/vl
  0
  echo "8" > /proc/irq/<M>/smp_affinity_list
  echo "9-12" > /sys/devices/.../sdma4/cpu_list
  cat /sys/devices/.../sdma4/vl
  1

This ensures that when a process runs on cpus 4, 5, 6, or 7 and uses
vl=0, the driver selects sdma engine 3, and the interrupt of sdma
engine 3 is steered to cpu 3.
Similarly, when a process runs on cpus 9, 10, 11, or 12 and sets vl=1,
engine 4 is selected and the irq of sdma engine 4 is steered to cpu 8.
This assumes that in the above N is the irq number of "sdma3",
and M is the irq number of "sdma4" in the /proc/interrupts file.

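The irq numbers N and M have to be looked up by hand in /proc/interrupts. The sketch below automates that lookup under one assumption that the text above does not confirm: that the engine name ("sdma3", "sdma4") appears as the last whitespace-separated field of its line. The helper name irq_for_name is hypothetical.

```shell
# Print the irq number of every line in an interrupts listing whose last
# field equals the given name.
# $1: name to look for (e.g. sdma3), $2: file to read; defaults to
# /proc/interrupts.
# Assumption: the name is the last field of its line.
# irq_for_name is a hypothetical helper name used only for this sketch.
irq_for_name() {
    awk -v name="$1" '
        $NF == name {
            sub(/:$/, "", $1)            # first field is "<irq>:", drop the colon
            print $1
        }
    ' "${2:-/proc/interrupts}"
}
```

If the assumption holds, the result can stand in for <N> in the commands above, for example: echo "3" > /proc/irq/$(irq_for_name sdma3)/smp_affinity_list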
ports/1/
  CCMgtA/
    cc_settings_bin - CCA tables used by PSM2
    cc_table_bin
    cc_prescan - enable prescanning for faster BECN response
  sc2vl/  - 32 files (0 - 31) used to translate sc->vl
  sl2sc/  - 32 files (0 - 31) used to translate sl->sc
  vl2mtu/ - 16 files (0 - 15) used to determine MTU for vl