Going further, power for power on the current generation, for SGEMM, I'd expect GPUs to beat CPUs by maybe 5x (the 6700K has a theoretical 512 GFLOPS at 91 W, while Xeons can come a bit shy of 1.5 TFLOPS at 145 W).
Which is still pretty major, but nowhere near the wild claims that used to be common in academic papers. It's outdated, but  is still pretty good reading.
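For anyone wondering where a figure like the 6700K's theoretical 512 GFLOPS comes from, here is a rough back-of-the-envelope sketch. It assumes 4 cores at the 4.0 GHz base clock, 2 FMA units per core, and 8 single-precision lanes per 256-bit AVX2 register; those inputs and the helper name `peak_sp_gflops` are my own assumptions for illustration, not something stated above. The Xeon line just reuses the rounded "1.5 TFLOPS at 145 W" number as an upper bound.

```python
def peak_sp_gflops(cores, ghz, fma_units=2, simd_lanes=8):
    """Theoretical single-precision peak in GFLOPS (hypothetical helper).

    Assumes every FMA unit retires one full-width FMA per cycle,
    and counts each FMA as 2 floating-point operations.
    """
    return cores * ghz * fma_units * simd_lanes * 2


# Assumed 6700K figures: 4 cores, 4.0 GHz base clock, 91 W TDP.
cpu_gflops = peak_sp_gflops(cores=4, ghz=4.0)
cpu_eff = cpu_gflops / 91          # GFLOPS per watt

# "A bit shy of 1.5 TFLOPS at 145 W", rounded up to 1500 GFLOPS.
xeon_eff = 1500 / 145              # GFLOPS per watt

print(f"6700K peak: {cpu_gflops:.0f} GFLOPS, {cpu_eff:.1f} GFLOPS/W")
print(f"Xeon:       {xeon_eff:.1f} GFLOPS/W")
```

These are theoretical peaks divided by TDP, so they say nothing about what SGEMM actually sustains on either chip; they only make the power-for-power comparison concrete.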
> Is this a typical performance profile for doing ray tracing in OpenGL? I expected tens of times faster than a CPU on commodity hardware.
SIMD instructions on modern CPUs are getting really good, within the ballpark of GPU speed in some cases. Here's the classic paper  on this, and the results have gotten even better since then with AVX2, etc.
"Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU" is a good paper on this; it obviously has an agenda, but it rings true in my experience.