We work quite a bit with RealSense D400 series cameras (mainly the D415) at Nomagic. These are popular cameras, as they offer RGB with depth input at a reasonable price, tens of times cheaper than industrial depth cameras. Because the RealSense is a stereo vision camera with a static pattern, multiple cameras can work side by side without interference. Ideally, the depth quality could be better (on the other hand, the RealSense computes depth in real time at high frame rates), but the main problem we hit was cameras intermittently refusing to work. These failures were often amplified by the complexity of our vision system, with multiple cameras connected to the system at any one time. Here, we share how to deal with some of the problems effectively.
Often, after the camera's ROS node process terminated, the camera wouldn’t run the next time we tried to use it; more rarely, the camera would stop working in the middle of an execution. In these cases, we would get error messages about an I/O error or a protocol error, or the camera would just disappear from rs-enumerate-devices and lsusb. Replugging the camera usually helped (the one case where we needed to reboot the computer is described below). We used to use powered hubs, and replugging the cable between the camera and the hub was always a safe bet, while replugging the cable between the hub and the PC helped only in some cases. This suggests there are at least two kinds of problems (see also the next section).
In one experiment we had 4 cameras connected to a system and restarted their ROS nodes 6631 times in a row. In 45 cases one of the four cameras didn’t reappear (about 0.7% of restarts). See this bug for details, but note that the results vary a lot: on some days the cameras worked much better, on others even worse.
Since unplugging the camera remotely is impossible, we have two more tricks:
Of course, such workarounds are not good enough for production use, so we were looking for better solutions.
The 5.8.15 firmware that comes out of the box with the cameras causes quite a few problems, with cameras refusing to work from time to time. The 5.9.x series seems to remove a lot of those problems, with 5.10.x providing yet more stability.
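A startup check can flag cameras that are still on the old firmware. The sketch below is a minimal illustration, assuming the version is available as a dotted string such as "05.08.15.00"; the function names and the three labels are hypothetical, and the thresholds simply restate the observations above.

```python
def parse_firmware(version: str) -> tuple:
    """Turn a dotted version string such as '05.08.15.00' into (5, 8, 15, 0)."""
    return tuple(int(part) for part in version.split("."))


def firmware_advice(version: str) -> str:
    """Classify a firmware version using the thresholds mentioned in the text.

    The labels are hypothetical; only the 5.9 / 5.10 boundaries come from the article.
    """
    major_minor = parse_firmware(version)[:2]
    if major_minor < (5, 9):
        return "upgrade"      # 5.8.x and older: frequent refusals to start
    if major_minor < (5, 10):
        return "ok"           # 5.9.x: many of those problems removed
    return "preferred"        # 5.10.x and newer: yet more stability
```

With pyrealsense2 the version string can be read with `dev.get_info(rs.camera_info.firmware_version)`; whether every device reports it in exactly this format is an assumption here.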
Sometimes the camera would connect in USB2 mode (even in a USB3 port) and work in a very limited mode (e.g., getting both the depth and infrared images becomes impossible, and so do higher resolutions or framerate). To check if this is the case, one can run “lsusb -t”. Each camera should be visible as 4 entries with 5000M speed. If you see only 3 entries with 480M speed, you need to replug the camera.
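The check above can be automated by parsing the output of “lsusb -t”. This is a rough sketch, not a robust parser: the sample text is illustrative rather than a captured log, the regular expressions assume the usual line layout of the tool, and filtering on Class=Video means an ordinary webcam would be reported too.

```python
import re
from collections import defaultdict

BUS_RE = re.compile(r"Bus\s+(\d+)")
# Device lines look like: "|__ Port 2: Dev 3, If 0, Class=Video, Driver=uvcvideo, 5000M"
DEV_RE = re.compile(
    r"Port\s+\d+:\s+Dev\s+(\d+),\s+If\s+\d+,\s+Class=([^,]*),\s+Driver=([^,]*),\s+(\d+)M"
)


def video_devices(lsusb_tree: str) -> dict:
    """Map (bus, dev) -> {'interfaces': n, 'speed': Mbps} for Class=Video entries."""
    devices = defaultdict(lambda: {"interfaces": 0, "speed": 0})
    bus = None
    for line in lsusb_tree.splitlines():
        if line.lstrip().startswith("/:"):  # root hub line starts a new bus
            match = BUS_RE.search(line)
            bus = int(match.group(1)) if match else None
            continue
        match = DEV_RE.search(line)
        if match and match.group(2).strip() == "Video":
            entry = devices[(bus, int(match.group(1)))]
            entry["interfaces"] += 1
            entry["speed"] = int(match.group(4))
    return dict(devices)


def usb2_fallbacks(lsusb_tree: str) -> list:
    """Return the (bus, dev) pairs of video devices running below 5000M."""
    return [key for key, info in video_devices(lsusb_tree).items()
            if info["speed"] < 5000]


# Illustrative input: one camera on USB3 (4 entries), one that fell back to USB2 (3 entries).
SAMPLE = """\
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Video, Driver=uvcvideo, 5000M
    |__ Port 1: Dev 2, If 1, Class=Video, Driver=uvcvideo, 5000M
    |__ Port 1: Dev 2, If 2, Class=Video, Driver=uvcvideo, 5000M
    |__ Port 1: Dev 2, If 3, Class=Video, Driver=uvcvideo, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 3: Dev 5, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 1, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 2, Class=Video, Driver=uvcvideo, 480M
    |__ Port 4: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 12M
"""
```

On a real machine the input would come from something like `subprocess.run(["lsusb", "-t"], capture_output=True, text=True).stdout`, and any device returned by `usb2_fallbacks` is a candidate for replugging.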
As explained in this document, RealSense uses a noticeable share of the USB3 theoretical bandwidth of 5000Mbps (real bandwidth is usually 3200Mbps, rarely 3600Mbps). This means that if you connect too many cameras to a single port using USB hubs, they will stop working due to bandwidth limitations. Even if you connect them to separate USB ports, motherboards tend to have one (sometimes two) internal USB hubs, so all (or half) of the ports share the 5000Mbps of theoretical bandwidth. The number of hubs can be checked with the “lsusb -t” command (look only for lines with “5000M”).
One can increase the total bandwidth by buying a PCIe card with more USB ports. Cheaper cards have an internal hub with all ports sharing the 5000Mbps bandwidth, but more expensive cards have separate bandwidth for each port.
Another solution is to decrease the framerate or resolution to consume less bandwidth.
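The bandwidth reasoning above can be turned into a back-of-the-envelope estimate. The sketch below is a rough upper bound, not a measurement: it assumes uncompressed streams at 16 bits per pixel for both depth and colour, ignores USB protocol overhead, and uses the 3200Mbps "real" figure quoted above as the budget. The class and function names are made up for illustration.

```python
from dataclasses import dataclass

REAL_USB3_MBPS = 3200  # typical real-world USB3 throughput quoted in the text


@dataclass(frozen=True)
class Stream:
    width: int
    height: int
    fps: int
    bits_per_pixel: int  # assumed 16 for both depth and colour in this sketch

    def mbps(self) -> float:
        """Raw payload bandwidth of the stream in megabits per second."""
        return self.width * self.height * self.fps * self.bits_per_pixel / 1e6


def cameras_per_controller(streams, budget_mbps=REAL_USB3_MBPS) -> int:
    """How many cameras with this stream set fit into one shared USB3 budget."""
    per_camera = sum(stream.mbps() for stream in streams)
    return int(budget_mbps // per_camera)


# Depth + colour at 1280x720, 30 fps versus 640x480, 15 fps.
high = [Stream(1280, 720, 30, 16), Stream(1280, 720, 30, 16)]
low = [Stream(640, 480, 15, 16), Stream(640, 480, 15, 16)]

print(cameras_per_controller(high))  # 3
print(cameras_per_controller(low))   # 21
```

Under these assumptions a single 1280x720 stream at 30 fps needs about 442Mbps, so lowering the resolution and frame rate changes the number of cameras a shared hub can carry by several times. Real limits will be lower once overhead is included.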
Resetting the camera before use proved to be a very helpful trick, particularly with older firmware versions (there are rumors that this has improved in newer ones). We encountered many cases of the camera successfully starting only after a programmatic reset such as: https://github.com/icarpis/realsense/commit/292e7f8204aa1ab03633a0b161e47ccdbdb69bc4. Adding such code to camera startup considerably decreased the number of cases where the cameras wouldn’t start.
However, resetting the camera exposes a bug present in some Linux kernels, including the one currently used in Ubuntu 16.04 LTS. As described in https://www.spinics.net/lists/linux-media/msg135855.html, after disconnecting a device (either by physically unplugging it or through the reset code mentioned above) some versions of the kernel leak /dev/media* devices. Once all 255 of them are used, RealSense won’t work. The bug is not present in kernel 4.4, is present in kernel 4.15, and we think it’s fixed in kernel 4.18.
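As a rough Python counterpart of such a reset (this is not the code from the linked commit), the sketch below calls `hardware_reset()` on every connected device and then waits for the cameras to re-enumerate. The helper name, the 5 second settle time and the injectable `sleep` are assumptions made for illustration; the right delay likely depends on the setup.

```python
import time


def reset_devices(devices, settle_seconds=5.0, sleep=time.sleep) -> int:
    """Call hardware_reset() on each device, then wait once for re-enumeration.

    `devices` is any iterable of objects with a hardware_reset() method,
    for example the result of pyrealsense2's context().query_devices().
    Returns the number of devices that were reset.
    """
    count = 0
    for dev in devices:
        dev.hardware_reset()
        count += 1
    if count:
        sleep(settle_seconds)  # assumed delay; cameras drop off the bus briefly
    return count


def main():
    # Imported here so reset_devices() can be used and tested without the SDK.
    import pyrealsense2 as rs

    ctx = rs.context()
    print(f"reset {reset_devices(ctx.query_devices())} device(s)")
```

Calling `main()` before starting the camera nodes would reset every attached RealSense; a production version would probably reset only the camera it is about to open.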
The symptoms are that “lsusb -t” shows “Driver=” instead of “Driver=uvcvideo”, the kernel logs contain “media: could not get a free minor” and, of course, “ls /dev/media*” shows 255 entries. The only workaround we know of is a reboot (or the solution of using a kernel that doesn’t suffer from this problem 🙂 ).
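These symptoms are easy to watch for before the limit is reached. The following is a minimal monitoring sketch: the function names and the 80% warning threshold are hypothetical, the limit of 255 is the figure quoted above, and the log check simply searches for the kernel message mentioned in the text.

```python
import glob
import os

MEDIA_NODE_LIMIT = 255  # number of /dev/media* entries quoted in the text


def count_media_nodes(dev_dir: str = "/dev") -> int:
    """Count the media* device nodes currently present in dev_dir."""
    return len(glob.glob(os.path.join(dev_dir, "media*")))


def leak_status(count: int, limit: int = MEDIA_NODE_LIMIT,
                warn_fraction: float = 0.8) -> str:
    """Classify the node count; warn_fraction is an arbitrary choice."""
    if count >= limit:
        return "exhausted"  # cameras will no longer bind to uvcvideo
    if count >= limit * warn_fraction:
        return "warning"    # schedule a reboot before the limit is hit
    return "ok"


def log_shows_exhaustion(kernel_log: str) -> bool:
    """True if the kernel log contains the message described above."""
    return "media: could not get a free minor" in kernel_log
```

Running `leak_status(count_media_nodes())` periodically gives an early warning, so a reboot can be planned instead of discovered during operation.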
If your depth stream starts looking like this:
it may be because the depth calibration data got corrupted. One can use the “Dynamic Calibration Tool” to restore the calibration data with “Intel.Realsense.CustomRW -g” and then run the calibration procedure.
Despite some searching for higher-quality cables, quite a few of the problems we had were due to the cables, especially for a camera that was moving with our robot’s arm. The symptoms are that at some point a specific camera starts to fail more and more often, until one day it doesn’t work at all. After replacing the cable with a new one, it starts working again. We don’t yet have a good way of detecting that a cable is about to break, but two signals may help: the GetPortErrorCount command introduced in USB 3.1, and the number of incomplete frames received. The former doesn’t seem to be exposed by any available command line tool, so we may need to write our own; the latter can be obtained with a RS2_NOTIFICATION_CATEGORY_FRAME_CORRUPTED notification.
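One way to use the incomplete-frame signal is to count corruption events per camera in a sliding time window and flag the cable when the count climbs. This is only a sketch of that idea: the class name, the one hour window and the threshold of 10 events are invented for illustration, and wiring `record()` to the SDK notification callback is left out.

```python
from collections import deque


class CorruptionMonitor:
    """Sliding-window counter of corrupted-frame events for one camera."""

    def __init__(self, window_seconds: float = 3600.0, threshold: int = 10):
        self.window_seconds = window_seconds  # hypothetical default
        self.threshold = threshold            # hypothetical default
        self._events = deque()

    def record(self, timestamp: float) -> None:
        """Register one corrupted-frame notification seen at `timestamp` (seconds)."""
        self._events.append(timestamp)
        self._trim(timestamp)

    def _trim(self, now: float) -> None:
        # Drop events that have fallen out of the window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()

    def count(self, now: float) -> int:
        """Number of events within the last window_seconds."""
        self._trim(now)
        return len(self._events)

    def cable_suspect(self, now: float) -> bool:
        """True once the recent event count reaches the threshold."""
        return self.count(now) >= self.threshold
```

Each camera would get its own monitor, with `record(time.monotonic())` called from the notification handler. Sensible window and threshold values would have to come from observing healthy cables first.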
Note that a side effect of USB3 signalling at 5000Mbps is radio noise around 2.5GHz, which interferes with radio equipment in the 2.4GHz band. We had a wireless keyboard that stopped working when plugged in next to a RealSense, and it was likely degrading RealSense performance as well. Making sure there are no such devices nearby or using shielded USB cables may help here.
Many tricks described here were a result of interaction with https://realsensesupport.intel.com, which we are grateful for. One can open tickets there and await a prompt reply.