Errors streaming high volumes of data over AXI interface
Bug Summary
We are debugging an OpenCPI application running on an Ettus E310. An HDL worker produces IQ samples that would normally be fed to the E310 DAC; for debug purposes we instead stream them back over the AXI bus to an RCC file_write worker for capture and further analysis. When we run the application we see a range of different errors and, in some cases, platform instability (SSH session disconnects).
To narrow this down, we have reproduced the issue in a simple application built from standard OpenCPI components. The application structure is as follows:
                  __________________
                  |                |
file_read --------|----> bias -----|-------> file_write
                  |     (hdl)      |
                  |________________|
See section below for some of the error messages obtained when running the application.
Additional information
- We are running the E310 in network mode (i.e. reading application, artifacts and input file from development host via NFS)
- We have tried both NFS and SD card locations for the file_write output file location - both exhibit similar errors
Steps to reproduce
The simple HDL assembly and application used to reproduce the issue are shown below:
HDL Assembly
<HdlAssembly>
  <Connection Name="dataIn" External="consumer">
    <Port Instance="bias_inst" Name="in"/>
  </Connection>
  <Instance Worker="bias_vhdl" Name="bias_inst"/>
  <Connection Name="dataOut" External="producer">
    <Port Instance="bias_inst" Name="out"/>
  </Connection>
</HdlAssembly>
Application XML File
<Application>
  <Instance Component="ocpi.core.file_read" name="file_read" connect="bias">
    <property name="fileName" value="input.dat"/>
    <property name="messageSize" value="512"/>
  </Instance>
  <Instance Component="ocpi.core.bias" name="bias" connect="file_write">
    <property name="biasValue" value="1"/>
  </Instance>
  <Instance Component="ocpi.core.file_write" name="file_write">
    <!-- <property name="fileName" value="/run/media/mmcblk0p1/output.dat"/> -->
    <property name="fileName" value="output.dat"/>
  </Instance>
</Application>
Instructions
- Build assembly and prepare the E310 platform
- Create an input data file (file contents are not important)
- Attempt to run the application (using ocpirun via an SSH session to the Ettus)
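Although the file contents do not matter for reproducing the errors, a known pattern makes dropped or corrupted data easier to spot in whatever output is captured. The following is a minimal sketch for generating such an input file on the development host. The helper name `write_ramp`, the ramp pattern, and the default size are illustrative choices, not part of the original setup, and the little-endian 32-bit word layout is an assumption.

```python
import struct


def write_ramp(path, n_words=1 << 20):
    """Write n_words little-endian uint32 values 0, 1, 2, ... to path.

    Returns the number of bytes written. The ramp pattern is an
    illustrative choice so that gaps in the captured data are visible.
    """
    chunk = 4096  # words per write, keeps memory use small
    with open(path, "wb") as f:
        for start in range(0, n_words, chunk):
            stop = min(start + chunk, n_words)
            f.write(struct.pack("<%dI" % (stop - start), *range(start, stop)))
    return n_words * 4


# Example: write_ramp("input.dat") creates a 4 MiB file named input.dat
```

A larger `n_words` gives a longer run, which may help when the failure only appears after some time.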
Relevant logs and/or screenshots
Below are some of the error messages produced when the application is run (captured with OCPI_LOG_LEVEL set to 8 or 10):
OCPI( 2:794.0633): Message was truncated to 512 bytes
OCPI( 2:794.0633): Message was truncated to 512 bytes
OCPI( 2:794.0633): Message was truncated to 512 bytes
OCPI( 2:794.0634): Message was truncated to 512 bytes
OCPI( 2:794.0634): Message was truncated to 512 bytes
OCPI( 2:794.0634): Message was truncated to 512 bytes
OCPI( 2:794.0635): Message was truncated to 512 bytes
OCPI( 2:794.0635): Message was truncated to 512 bytes
OCPI( 2:794.0635): Message was truncated to 512 bytes
OCPI( 2:794.0636): Message was truncated to 512 bytes
OCPI( 2:794.0636): Message was truncated to 512 bytes
OCPI( 2:794.0636): Message was truncated to 512 bytes
OCPI( 2:794.0636): Message was truncated to 512 bytes
OCPI( 2:794.0637): Message was truncated to 512 bytes
OCPI( 2:794.0637): Message was truncated to 512 bytes
OCPI( 2:794.0637): Message was truncated to 512 bytes
OCPI( 8:824.0329): Error Exception: Worker file_write produced error during execution: error writing data to file: length 512(200): Input/output error (-1)
Container "rcc0" background thread exception: Worker file_write produced error during execution: error writing data to file: length 512(200): Input/output error (-1)
Aborted
free(): invalid pointer
Connection to 192.168.1.150 closed.
OCPI( 8:907.0269): Starting proxy workers that are not slaves.
OCPI( 8:907.0269): Starting proxy workers that are also slaves, but not sources.
OCPI( 8:907.0269): Starting proxy workers that are also slaves and are sources.
OCPI( 8:907.0269): Starting workers that are not proxies and not sources.
OCPI( 8:907.0277): Worker "start" control operation succeeded, now in state "OPERATING": worker "file_write" in container rcc0 from artifact worker file_writercc
OCPI( 8:907.0277): Starting container rcc0(1): 0x647768
OCPI( 8:907.0277): HDL Control Op Succeeded: worker bias_vhdl:a/bias_vhdl(2) op start(1)
OCPI( 8:907.0277): Worker "start" control operation succeeded, now in state "OPERATING": worker "bias_vhdl" in container PL:0 from artifact worker bias_vhdlhdl/a/bias_vhdl
OCPI( 8:907.0278): Starting workers that are sources.
OCPI( 8:907.0285): Worker "start" control operation succeeded, now in state "OPERATING": worker "file_read" in container rcc0 from artifact worker file_readrcc
OCPI( 8:950.0654): Error Exception: Worker file_write produced error during execution: error writing data to file: length 72340(11a94): Success (65024)
Container "rcc0" background thread exception: Worker file_write produced error during execution: error writing data to file: length 72340(11a94): Success (65024)
OCPI( 8: 36.0263): Using 2 containers to support the application
OCPI( 8: 36.0264): Starting proxy workers that are not slaves.
OCPI( 8: 36.0264): Starting proxy workers that are also slaves, but not sources.
OCPI( 8: 36.0264): Starting proxy workers that are also slaves and are sources.
OCPI( 8: 36.0264): Starting workers that are not proxies and not sources.
OCPI( 8: 36.0351): Worker "start" control operation succeeded, now in state "OPERATING": worker "file_write" in container rcc0 from artifact worker file_writercc
OCPI( 8: 36.0351): Starting container rcc0(1): 0x647768
OCPI( 8: 36.0352): HDL Control Op Succeeded: worker bias_vhdl:a/bias_vhdl(2) op start(1)
OCPI( 8: 36.0352): Worker "start" control operation succeeded, now in state "OPERATING": worker "bias_vhdl" in container PL:0 from artifact worker bias_vhdlhdl/a/bias_vhdl
OCPI( 8: 36.0352): Starting workers that are sources.
OCPI( 8: 36.0353): Worker "start" control operation succeeded, now in state "OPERATING": worker "file_read" in container rcc0 from artifact worker file_readrcc
Assertion failed: id => m_ports.size() is false at ../gen/os/include/OsAssert.hh:115.
Stack Dump:
ocpirun(_ZN4OCPI2OS9dumpStackERSo+0x20) [0x271f78]
ocpirun() [0x272c04]
ocpirun(_ZN4OCPI2OS15assertionFailedEPKcS2_j+0x5c) [0x272c98]
ocpirun(_Z9ocpiAbortPKc+0x24) [0x1daaec]
ocpirun(_ZN4OCPI9Transport7PortSet18getPortFromOrdinalEj+0x50) [0x1db3d4]
ocpirun(_ZN4OCPI9Transport6Buffer7isEmptyEv+0x308) [0x1deb44]
ocpirun(_ZN4OCPI9Transport12OutputBuffer7isEmptyEv+0x74) [0x1dd94c]
ocpirun(_ZN4OCPI9Transport10Controller24getNextEmptyOutputBufferEPNS0_4PortE+0x58) [0x1ef1f4]
ocpirun(_ZN4OCPI9Transport4Port24getNextEmptyOutputBufferEv+0xe8) [0x1d5474]
ocpirun(_ZN4OCPI9Transport4Port24getNextEmptyOutputBufferERPhRj+0x20) [0x1d52b4]
ocpirun(_ZN4OCPI9Container9BasicPort14getEmptyBufferEv+0x1f4) [0x1715d0]
ocpirun(_ZN4OCPI9Container9BasicPort9getBufferERPhRj+0x10c) [0x171744]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC4Port10requestRccEj+0x88) [0xb6c7768c]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC4Port10advanceRccEj+0x160) [0xb6c7b834]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(+0x2c854) [0xb6c6c854]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(+0x2c7b8) [0xb6c6c7b8]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC6Worker10advanceAllEv+0x68) [0xb6c6f164]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC6Worker3runERb+0x7f0) [0xb6c6ed1c]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC11Application3runEPNS_4Xfer12EventManagerERb+0x74) [0xb6c66898]
/mnt/net/exports/xilinx19_2_aarch32/lib/libocpi_rcc_s.so(_ZN4OCPI3RCC9Container8dispatchEPNS_4Xfer12EventManagerE+0xf8) [0xb6c68e2c]
ocpirun(_ZN4OCPI9Container9Container11runInternalEj+0x128) [0x176510]
ocpirun(_ZN4OCPI9Container9Container6threadEv+0x2c) [0x176650]
ocpirun(_ZN4OCPI9Container12runContainerEPv+0xd4) [0x1767c8]
ocpirun() [0x276d64]
Environment info
- OpenCPI Version used: 2.4.3 (but have also tested with 2.4.6)
- Environment vars (env | grep -i ocpi | sort):
OCPI_CDK_DIR=/mnt/net/cdk
OCPI_DEFAULT_HDL_DEVICE=pl:0
OCPI_ENABLE_HDL_SIMULATOR_DISCOVERY=0
OCPI_HDL_PLATFORM=
OCPI_LIBRARY_PATH=/mnt/net/project-registry/ocpi.core/exports/artifacts:/mnt/net/cdk/xilinx19_2_aarch32/artifacts:/mnt/net/projects/assets/artifacts:/run/media/mmcblk0p1/opencpi/xilinx19_2_aarch32/artifacts:/mnt/net/project-registry/local.bias-ocpi/artifacts:/mnt/net/project-registry/ocpi.osp.e3xx/artifacts:/mnt/net/project-registry/ocpi.assets/artifacts:/mnt/net/project-registry/ocpi.platform/artifacts
OCPI_LOCAL_DIR=/run/media/mmcblk0p1/opencpi
OCPI_LOG_LEVEL=8
OCPI_NET_DIR=/mnt/net
OCPI_RELEASE=opencpi-v2.4.3
OCPI_ROOT_DIR=/mnt/net
OCPI_SYSTEM_CONFIG=/run/media/mmcblk0p1/opencpi/system.xml
OCPI_TOOL_DIR=xilinx19_2_aarch32
OCPI_TOOL_OS=linux
OCPI_TOOL_PLATFORM=xilinx19_2_aarch32
- Operating System and version (ex. CentOS 7): Ubuntu 20.04
- Link to your project on GitLab (optional):
Possible fixes (Optional)
None at present
Acceptance criteria
Ability to run application to completion without errors.
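Beyond the absence of errors, a completed run could also be checked for data integrity. The sketch below compares the captured output against the input, assuming that the bias worker adds biasValue to each 32-bit unsigned word, that both files hold raw little-endian data with no message framing, and that biasValue is 1 as in the application XML above. The function name `first_mismatch` is hypothetical.

```python
import struct


def first_mismatch(in_path, out_path, bias=1):
    """Return None if every output word equals input word + bias (mod 2**32).

    Otherwise return the index of the first differing 32-bit word. If all
    common words match but the file lengths differ, return the number of
    whole words the two files have in common.
    """
    with open(in_path, "rb") as f:
        src = f.read()
    with open(out_path, "rb") as f:
        dst = f.read()
    n = min(len(src), len(dst)) // 4  # whole words present in both files
    a = struct.unpack("<%dI" % n, src[: 4 * n])
    b = struct.unpack("<%dI" % n, dst[: 4 * n])
    for i, (x, y) in enumerate(zip(a, b)):
        if (x + bias) & 0xFFFFFFFF != y:
            return i
    if len(src) != len(dst):
        return n  # truncated or over-long output
    return None


# Example: first_mismatch("input.dat", "output.dat") is None for a clean run
```

A non-None result gives the word offset at which the capture first diverges, which may help correlate data loss with the logged errors.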