I have made some awesome progress this week!
This video was recorded directly on the Raspberry Pi living on Smokey. It’s correctly detecting targets and giving a reasonable approximation of their distance away as well as vaguely correct angles (sometimes the angle is completely wrong, but it’s mostly okay and can at least be used to tell the ‘copter whether to turn clockwise or anti-clockwise). I have it running at a reasonable framerate -- having tested it only on much faster laptops, I forgot that I would be transferring my code to a 900MHz ARM processor. The first time I tested it on the Pi, I was getting a feeble ~3 frames per second. It’s nearly 20 frames/sec now, at an 800x600 resolution -- not bad! I was aiming for a framerate above 10fps.
Things that I have done since last Friday:
Started a new C++ project from scratch, with a better idea of how the system should be designed.
Implemented the finite state machine pattern recognition method mentioned in the previous post (StateMachine class)
Written the loop which uses this to track Markers (TargetFinder::doTargetRecognition), and converts reasonable markers (ones that are square, not too small and not too large, roughly the same size and roughly the right distances apart) into Target objects.
Multi-level thresholding using fixed bins -- thresholding the input image at multiple fixed levels (using 3-5 bins has empirically been pretty good). If marker patterns are detected at *any* of those thresholds, they’re incorporated into nearby marker objects that are agnostic about thresholds.
Implemented the code that calculates the angle of a target -- this is still pretty much ripped from Neal’s original code (thanks Neal!) as it’s pretty standard trigonometry, finding the shortest angle in an isosceles triangle... but thinking in radians hurts my brain and it does glitch out occasionally, so I need to maybe re-think this part.
Picking the best target -- right now this is simply the biggest target, but when there are lots of false detections, the real target in the image may be a smaller one. I think I may need to get altitude from the flight controller, which would allow me to get an approximation of how big the real target should be given the ‘copter’s current distance from the ground.
Calculating the real distance -- knowing the focal length and sensor size of the Pi camera, the real size of the target, and both the image dimensions and the target dimensions in pixels, the CameraModel class calculates how far away it thinks a target is. In my tests this is fairly accurate (in the video, I was holding the ‘copter between 1 and 2 metres off the floor).
Persistent tracking of a known target between frames -- this is handled by the PersistentTarget class. The tracked targets tend to flicker in and out of existence due to camera motion, and can vary wildly in size when the camera exposure changes. The PersistentTarget is one that has a finite lifetime and “dies” if its information is not updated within a given timeout period (currently, 1 second). It gets updated with the information from the best detected target (if there is one) every frame, with a smoothing factor applied, so that falsely detected targets have less of an influence than correctly detected targets. When false detections occur, they tend to flicker all over the image and only last for one or two frames. The next thing to try is to implement a similarity heuristic so the target is only updated by detected targets when they are acceptably similar in shape and angle, otherwise it dies. The one concern here is what happens if a falsely detected target is itself persistent.
Determining flight controller stick commands based on the target’s position in the image -- this is done by the Navigator class. This is shown by the green dots in the video. Again, this is updated by the persistent target’s position and angle. If one isn’t detected, the controller’s pitch, roll and yaw (or rather, vertical, horizontal and rotation) values smoothly reset to centre values. The next step would be to implement this as a PID controller that takes into account framerate, barometer altitude and calculated distance of the target -- so that it’s more gentle on the pitch/roll commands when it’s high up in the air, and it won’t oscillate when trying to keep the target centred. Yaw commands are a little simpler, but need a similar PID controller that stops it from continuously reversing direction when the angle calculations go a little haywire (i.e. flipping by 180 degrees every few frames).
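The distance estimate described in the list above comes down to the standard pinhole camera relation. Here is a minimal sketch of it, not the actual CameraModel code: the struct name, field names and the idea of working from widths only are all assumptions made for illustration. The numbers in the usage note (3.6 mm focal length, 3.76 mm sensor width) are the commonly quoted figures for the original Pi camera module and should be checked against the datasheet for whichever camera is fitted.

```cpp
#include <cassert>

// Hypothetical stand-in for the post's CameraModel class.
// All names here are illustrative assumptions, not the real interface.
struct CameraModelSketch {
    double focal_length_mm;  // lens focal length
    double sensor_width_mm;  // physical width of the sensor area behind the image
    int image_width_px;      // width of the captured frame in pixels

    // Pinhole model:
    //   distance = focal_length_px * real_width / width_in_pixels
    // where focal_length_px converts the focal length from mm to pixels.
    double distanceMm(double target_real_width_mm, double target_width_px) const {
        assert(target_width_px > 0.0 && sensor_width_mm > 0.0);
        const double focal_length_px =
            focal_length_mm * image_width_px / sensor_width_mm;
        return focal_length_px * target_real_width_mm / target_width_px;
    }
};
```

For example, `CameraModelSketch{3.6, 3.76, 800}.distanceMm(200.0, 100.0)` gives roughly 1.5 metres for a 200 mm wide target that appears 100 pixels wide. One caveat: if the driver crops the sensor rather than scaling the full frame down to 800x600, the effective sensor width is smaller and the estimate will be off by the same ratio.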
When I first tested the code on the Pi, it was slooow. I ran my program through cachegrind, which showed me that most of the processing time was spent inside the StateMachine::step function. That was to be expected -- it’s called for every pixel in a row, plus some more when a target pattern is detected horizontally. So I simplified the code a little (converted all double-precision floats to single-precision ones, replaced with integers where possible), but that didn’t provide much of a performance boost. This code can behave massively differently on different CPU architectures! The Pi 2 has an armv7 CPU. Previously I was compiling for my Intel 64-bit laptops, which have loads of hardware optimizations for floating point calculations and vectorization. So does that ARM chip - VFP and NEON are both features of the CPU that can accelerate integer and floating point calculations. Despite the Arch Linux version running on my Pi being the “hard-float” variant, it wasn’t compiling my code with these optimizations by default. I had to tell the compiler to use VFP and NEON when compiling on the Pi. That sped things up roughly 3x - framerates went from 2-3fps to 8-10fps.
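The post doesn’t say which flags were used, so treat the following as an assumption: with GCC on a Pi 2 (Cortex-A7), a typical way to ask for VFP and NEON is `-mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard`. GCC’s documentation also notes that its auto-vectorizer will not emit NEON floating-point code unless `-funsafe-math-optimizations` is given as well. A quick way to confirm what a build was actually compiled for is to check the compiler’s predefined macros, as in this small sketch (the function name is made up for illustration):

```cpp
#include <string>

// Reports which ARM floating-point/SIMD features this translation unit was
// compiled for, based on predefined macros on 32-bit ARM:
//   __ARM_NEON / __ARM_NEON__ : NEON (Advanced SIMD) code generation enabled
//   __ARM_FP                  : hardware floating point available
//   __ARM_PCS_VFP             : hard-float calling convention in use
// On other architectures the 32-bit ARM macros are simply absent.
std::string armFeatureSummary() {
    std::string summary;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    summary += "NEON: on\n";
#else
    summary += "NEON: off\n";
#endif
#if defined(__ARM_FP)
    summary += "hardware FP: on\n";
#else
    summary += "hardware FP: off\n";
#endif
#if defined(__ARM_PCS_VFP)
    summary += "hard-float ABI: on\n";
#else
    summary += "hard-float ABI: off\n";
#endif
    return summary;
}
```

Printing `armFeatureSummary()` once at startup makes it obvious when a build has silently fallen back to the defaults.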
Possibly I could squeeze out some better performance with more compiler optimizations, but the root of the problem is still that StateMachine::step function. I added a row_step parameter -- allowing the target detector to look at only every Nth row (currently every 4th row) got me to just over 20fps with no appreciable loss in marker detection accuracy. Cool!
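The row-skipping idea can be sketched as follows. This is not the real TargetFinder code: `GreyFrame`, `scanRows` and the per-pixel callback (standing in for the call to StateMachine::step) are hypothetical names, and the frame is assumed to be a row-major greyscale buffer with one byte per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical greyscale frame: row-major, one byte per pixel.
struct GreyFrame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Calls on_pixel(x, y, value) for every pixel of every row_step-th row
// (rows 0, row_step, 2*row_step, ...). Returns how many pixels were visited,
// which is roughly the number of state machine steps the frame costs.
template <typename PixelFn>
std::size_t scanRows(const GreyFrame& frame, int row_step, PixelFn&& on_pixel) {
    assert(row_step >= 1);
    std::size_t visited = 0;
    for (int y = 0; y < frame.height; y += row_step) {
        const std::uint8_t* row =
            frame.pixels.data() + static_cast<std::size_t>(y) * frame.width;
        for (int x = 0; x < frame.width; ++x) {
            on_pixel(x, y, row[x]);
            ++visited;
        }
    }
    return visited;
}
```

At 800x600, a row_step of 4 cuts the per-frame work from 480,000 pixel visits to 120,000. The trade-off is that a marker fewer than row_step rows tall can fall entirely between scanned rows, so the step size effectively sets a minimum detectable marker height.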
Annoyingly, I can’t get a particularly high camera resolution using the v4l2 Pi-Cam driver. I would like to get full 1080p, as I did some calculations and at that resolution I could track targets from almost 20 metres in the air. Right now I seem to be limited to 800x600 at most -- when I ask for a higher resolution, my program just hangs at 100% CPU usage, waiting on nonexistent video input... If I had time I would maybe get frames directly, without going through the extra abstraction of the Video4Linux camera driver. But that’s lots of low-level stuff I don’t have time to try.
So now I’m at that point where I can start working on the fun quadcopter-controlling bits. So here’s the loads of stuff I want to do next:
Work out how to reduce the false markers detected! In video frames with high variance, there are a lot of false markers because the noise creates a lot of potentially valid patterns. Looking at a running average and variance and using that to determine the distribution of the threshold bins may help here.
Fix Smokey! My last flight resulted in a bit of a warped frame and two of the motors have been knocked out of alignment. Needs some expanding foam or something to fill in the holes where EPP foam has been ripped out.
Fix my code for commanding my flight controller. Currently I can request data from my MultiWii controller, but it seems to ignore controller commands. I think my checksums are wrong or something. So I think writing a C++ library I can use in my vision code but also from Python would be good.
Write a high-level flight controller based on different “behaviours” -- it should be able to program GPS waypoints on the MultiWii controller, and detect when it’s reached the last one -- assuming the last one is a landing target, it should then be able to switch to “precision land” mode which will tighten up the PIDs, switch on the vision processing stuff and put it into accelerometer/barometer/altitude hold mode. Then it can use the joystick commands generated by the Navigator class to gently guide the ‘copter over the target and slowly reduce the altitude until the last recorded distance is 1 metre or less.
I still need to figure out whether it really is safe to land at this point (i.e. reduce the throttle to its minimum and then completely switch off the motors after a short period / disarm the ‘copter). To start with I think it should just stop sending the FC commands, allowing me to resume control manually with a radio controller.
Tweak the PIDs on Smokey by hand to find values that are good for precision landing (and tighten them up generally as currently she drifts all over the place in the gentlest of breezes). I also need to test out the GPS waypointing.
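On the flight controller checksum item above: in version 1 of the MultiWii Serial Protocol, a request frame is the three bytes `$M<`, then a payload size byte, a command byte, the payload and finally a checksum. The checksum is the XOR of the size byte, the command byte and every payload byte; the `$M<` header is not included. The sketch below builds such a frame, plus an MSP_SET_RAW_RC (command 200) message carrying eight 16-bit little-endian channel values. The function names are made up for illustration, and this is untested against real hardware, so it may well not be the cause of the ignored commands.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kMspSetRawRc = 200;  // MSP_SET_RAW_RC command ID

// Builds an MSP v1 request frame: '$' 'M' '<' size command payload... checksum.
// The checksum XORs the size, the command and all payload bytes.
std::vector<std::uint8_t> buildMspRequest(std::uint8_t command,
                                          const std::vector<std::uint8_t>& payload) {
    assert(payload.size() <= 255);  // size must fit in one byte
    const std::uint8_t size = static_cast<std::uint8_t>(payload.size());
    std::vector<std::uint8_t> frame{'$', 'M', '<'};
    frame.push_back(size);
    frame.push_back(command);
    std::uint8_t checksum = static_cast<std::uint8_t>(size ^ command);
    for (std::uint8_t b : payload) {
        frame.push_back(b);
        checksum ^= b;
    }
    frame.push_back(checksum);
    return frame;
}

// Packs eight RC channel values (typically 1000-2000, centre 1500) as
// little-endian 16-bit integers. Channel order is defined by the firmware.
std::vector<std::uint8_t> buildSetRawRc(const std::array<std::uint16_t, 8>& channels) {
    std::vector<std::uint8_t> payload;
    for (std::uint16_t value : channels) {
        payload.push_back(static_cast<std::uint8_t>(value & 0xFF));  // low byte
        payload.push_back(static_cast<std::uint8_t>(value >> 8));    // high byte
    }
    return buildMspRequest(kMspSetRawRc, payload);
}
```

Comparing the bytes this produces against what the existing code sends, for the same channel values, is a cheap way to confirm or rule out the checksum theory.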
(Bearing in mind I now have 3 weeks left to do all this!)