Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs