This particular project evolved my GLSL skills and started to teach my brain how to work on a GPU better which, naturally, means I have a very nicely decorated whiteboard to my right at the moment. And I'll be sharing it too, in a moment...
I've been thinking about doing a Computer Vision project for a little while now. I was inclined to do a desktop app and use OpenCV, but on doing some research into what goes into computer vision I realized, "Hey, I know the tools to make this work in the browser." So, I did.
My first task to get Computer Vision working was to take the RGB feed from my webcam and convert it to HSV. I've done chroma keys before using just the raw RGB values, but with the recommendation that HSV might work better, I figured why not give it a try? This is where the GLSL came in, and what I had to write was an implementation of the following:
Let $R$ be the red RGB component in the range 0.0 to 1.0
Let $G$ be the green RGB component in the range 0.0 to 1.0
Let $B$ be the blue RGB component in the range 0.0 to 1.0
Let $\epsilon$ be an inconsequential float value to prevent division by 0.
The definition of $\max$ is elided, but it just returns the maximum of the values passed to it.
Each channel difference used for the hue had to be accompanied by a related offset value.
So my final result for hue is actually the selected difference term plus its offset. This is because, in the HSV color space, hue is represented as a circle. So, in order to select the right hue, an offset needs to be applied to bring the value round to the right place.
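For reference, here is a minimal CPU-side sketch of the conversion in JavaScript. It follows the common textbook formulation of RGB to HSV rather than the exact shader from this project, and it assumes hue is normalized to the range 0.0 to 1.0. The names `rgbToHsv` and `EPSILON` are made up for illustration.

```javascript
// Illustrative names only: EPSILON and rgbToHsv do not come from the project.
const EPSILON = 1e-10; // tiny value that keeps the divisions away from zero

// r, g, b are expected in the range 0.0 to 1.0.
// Returns [h, s, v], each in the range 0.0 to 1.0 (hue normalized, an assumption).
function rgbToHsv(r, g, b) {
  const maxC = Math.max(r, g, b);
  const minC = Math.min(r, g, b);
  const delta = maxC - minC;

  const v = maxC;
  const s = delta / (maxC + EPSILON);

  // Each channel difference is paired with an offset (0, 2 or 4) that moves
  // the result to the right third of the hue circle.
  let h;
  if (maxC === r) {
    h = (g - b) / (delta + EPSILON) + 0.0;
  } else if (maxC === g) {
    h = (b - r) / (delta + EPSILON) + 2.0;
  } else {
    h = (r - g) / (delta + EPSILON) + 4.0;
  }

  h = h / 6.0;           // scale the six sectors down to one full turn
  h = h - Math.floor(h); // wrap negative values back onto the circle
  return [h, s, v];
}
```

A branching version like this is handy as a known-good reference to compare shader output against.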
I found some other documentation about this calculation online and found it messy. That might be normal when discussing math like this, I don't really know, but my engineering brain likes things broken down into their individual components. Hopefully, the above will help someone else out! 🙂 Now to calculate that on the GPU, which makes it time to live up to my promise of bringing the whiteboard in...
I ended up leaning heavily on mix() and step() in order to reduce the branching I had to write. Branching, as I understand it, is getting to be less of a concern in GPU programs, but the article saying that branching implementations were improving was from 2011, and that's too recent to assume all GPUs will behave nicely. Below those function notes, there's a tracking of how my values were going to flow through the vectors so I could end up with the maximum in the first position, the two values I needed to subtract to get my difference, and finally the offset that went with that selection so I could calculate the hue properly. For the curious, the shuffling that's done with mix() and step() was effectively a vectorized ternary operator.
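To make the "vectorized ternary" idea concrete, here is a small JavaScript sketch that imitates GLSL's `step()` and `mix()` on plain arrays. The vector layout (maximum, the two values to subtract, offset) follows the description above, but the exact ordering and the name `selectMaxChannel` are assumptions, not the project's shader code.

```javascript
// JavaScript stand-in for GLSL step(edge, x): 0.0 when x < edge, else 1.0.
function step(edge, x) {
  return x < edge ? 0.0 : 1.0;
}

// JavaScript stand-in for GLSL mix(a, b, t), applied per component.
// With t equal to 0.0 or 1.0 it simply picks a or b, like a ternary.
function mix(a, b, t) {
  return a.map((ai, i) => ai * (1.0 - t) + b[i] * t);
}

// Hypothetical helper: returns [max, minuend, subtrahend, offset]
// without writing a single if statement.
function selectMaxChannel(r, g, b) {
  // Start by assuming red is the largest channel.
  let p = [r, g, b, 0.0];
  // Swap in the green candidate when g >= current maximum.
  p = mix(p, [g, b, r, 2.0], step(p[0], g));
  // Swap in the blue candidate when b >= current maximum.
  p = mix(p, [b, r, g, 4.0], step(p[0], b));
  return p;
}
```

For example, `selectMaxChannel(0.2, 0.9, 0.4)` picks the green candidate, so the hue would then come from subtracting the second and third entries and adding the fourth as the offset.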
The conversion of RGB to HSV and writing the GLSL shader to do that was most of the effort. But the shader wasn't quite done yet. The final step was passing min and max values for each of H, S, and V. When all three were within the defined min and max, the shader needed to return white; when any of them were out of bounds, the shader had to return black. This way, the user can select an object in the real world by its color, and that object will function as a paddle. Now, I did look for and find shader implementations of RGB to HSV online. However, I opted to roll my own to make sure I understood what was going on, as this was pretty much the beating heart of this project.
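The thresholding step can be sketched on the CPU like this. It is only an illustration of the white-or-black rule described above: `hsvMask` is a hypothetical name, the colors are written as RGBA arrays, and the opaque alpha of 1.0 is an assumption.

```javascript
// Hypothetical helper: hsv, minHsv and maxHsv are [h, s, v] arrays
// with every component in the range 0.0 to 1.0.
function hsvMask(hsv, minHsv, maxHsv) {
  // A pixel passes only when all three components sit inside their bounds.
  const inside = hsv.every((c, i) => c >= minHsv[i] && c <= maxHsv[i]);
  // White for a match, black otherwise (alpha of 1.0 is an assumption).
  return inside ? [1.0, 1.0, 1.0, 1.0] : [0.0, 0.0, 0.0, 1.0];
}
```

One thing a simple min and max pair like this does not handle is a hue range that straddles the wrap-around point of the circle, which would need two ranges or some extra logic.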
For much larger computer vision problems I would probably turn to something like OpenCV. Still, getting to understand something like why we sometimes convert to the HSV color space is, I think, very important! Understanding how the tools you are using work is, I hold, key to actually using them well. It also allows you to answer a question I've mentioned before as being part of my process. 😉
You will need either a recent version of Firefox or Chrome in order to run this project due to the needs of getUserMedia() and WebGL.