Yeah, almost any kernel that doesn't have a sharp cut-off works moderately well, though I haven't done an exhaustive test of alternatives.
I have. The key thing is how many continuous derivatives your kernel has. If the kernel itself is discontinuous, it works poorly. The standard smoothing used in Training Peaks is discontinuous (flat, then zero). A truncated Gaussian is also discontinuous, but the discontinuity can be made quite small. A kernel that is smooth to two derivatives is cosine-squared, evaluated between -π/2 and +π/2. This works quite well, but becomes time-consuming for large smoothing distances. The two-sided exponential is smooth to zero order, but its first derivative is discontinuous. It's smoother than the flat kernel, but not as good as the cosine-squared kernel. But it's very efficient to calculate.
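For concreteness, here's a rough sketch of the kernels being compared (the function names and scale parameters are my own, and the kernels are left unnormalized; only their behavior at the cut-off matters for this comparison):

```python
import numpy as np

def flat(dt, half_width):
    """Boxcar-style smoothing (flat, then zero): discontinuous at the edge."""
    return np.where(np.abs(dt) <= half_width, 1.0, 0.0)

def truncated_gaussian(dt, sigma, half_width):
    """Gaussian cut off at +/- half_width: still discontinuous at the edge,
    but the jump is small if half_width is several times sigma."""
    g = np.exp(-0.5 * (dt / sigma) ** 2)
    return np.where(np.abs(dt) <= half_width, g, 0.0)

def cosine_squared(dt, half_width):
    """cos^2 evaluated between -pi/2 and +pi/2: both the value and the slope
    go to zero at the edge, so the cut-off is much gentler."""
    u = (np.pi / 2) * dt / half_width
    return np.where(np.abs(dt) <= half_width, np.cos(u) ** 2, 0.0)

def two_sided_exponential(dt, tau):
    """exp(-|dt|/tau): continuous everywhere, but with a slope break at dt = 0."""
    return np.exp(-np.abs(dt) / tau)
```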
The key in the convolution, whatever the kernel, is not to assume constant values between points, but rather to assume linear variation between points. This improves the high-frequency rejection and reduces sensitivity to sampling. This is what I did with the exponential smoothing I described. So you want a kernel which, when multiplied by a linear interpolation function, is analytically integrable, if possible. With exponentials and exponential-like functions, including cosine-squared, this is generally the case.
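Here's a minimal sketch of what that looks like for the two-sided exponential kernel (my own code, not the implementation described above): the signal is treated as piecewise linear between samples, and each segment's integral against exp(-|t - t0|/tau) is done in closed form rather than by sampling the kernel at the data points.

```python
import numpy as np

def _seg_integral(a, b, u1, u2, tau):
    """Closed-form integral of (a + b*u) * exp(-u/tau) du from u1 to u2, u >= 0."""
    F = lambda u: -tau * np.exp(-u / tau) * (a + b * u + b * tau)
    return F(u2) - F(u1)

def exp_smooth(t, x, tau):
    """Smooth x(t) with a two-sided exponential kernel exp(-|dt|/tau),
    assuming linear variation of x between sample times t."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    for j, t0 in enumerate(t):
        num = 0.0   # integral of kernel * signal
        den = 0.0   # integral of kernel alone (normalizes the edges)
        for i in range(len(t) - 1):
            ta, tb = t[i], t[i + 1]
            slope = (x[i + 1] - x[i]) / (tb - ta)
            # split any segment that straddles t0 so |t - t0| has one sign
            for sa, sb in ((ta, min(tb, t0)), (max(ta, t0), tb)):
                if sb <= sa:
                    continue
                # rewrite the linear interpolant as a + b*u with u = |t - t0|
                a = x[i] + slope * (t0 - ta)
                if sb <= t0:                  # left of t0: t = t0 - u
                    u1, u2, b = t0 - sb, t0 - sa, -slope
                else:                         # right of t0: t = t0 + u
                    u1, u2, b = sa - t0, sb - t0, slope
                num += _seg_integral(a, b, u1, u2, tau)
                den += _seg_integral(1.0, 0.0, u1, u2, tau)
        y[j] = num / den
    return y
```

The double loop above is just for clarity; in practice the exponential kernel is what makes this cheap, since the same piecewise-linear integrals can be accumulated in a single forward pass plus a single backward pass, which is presumably why it's described as very efficient to calculate.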