Tail risk calc used to be a slow for-loop mess. I wanted to see how often extreme returns showed up in each rolling window, but computing percentiles window by window with [[NumPy]] was the bottleneck. Switched to CuPy + prebatched rolling ops. Now the tail risk calc is fully GPU-parallel and ~100x faster. Still using `.nanpercentile`, but chunked so we don’t OOM. This unlocked running it across all periods (1h to 1M) in one pass. Hugely useful for stress modeling. [[Tail Risk]] [[Vectorization]] [[CuPy]] [[NumPy]] [[Serendipity]]
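
Rough sketch of the batched + chunked idea, for future reference. This is not the actual pipeline code: it is written against plain NumPy so it runs anywhere, and the function name `rolling_tail_percentile`, the `chunk` size, and the 5th-percentile default are all made up for illustration. A GPU version would presumably swap in the CuPy equivalents, which is an assumption worth checking against the CuPy docs before relying on it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_tail_percentile(returns, window, q=5.0, chunk=4096):
    """Rolling q-th percentile of `returns`, NaN-aware, computed in chunks.

    Hypothetical helper: the name, `q` default, and `chunk` default are
    illustrative assumptions, not taken from the original pipeline.
    """
    returns = np.asarray(returns, dtype=np.float64)
    n_windows = returns.size - window + 1
    if n_windows <= 0:
        return np.empty(0, dtype=np.float64)

    # Strided view of shape (n_windows, window); no data is copied here.
    windows = sliding_window_view(returns, window)

    out = np.empty(n_windows, dtype=np.float64)
    for start in range(0, n_windows, chunk):
        stop = min(start + chunk, n_windows)
        # Only this slice of windows is handed to nanpercentile, so working
        # memory scales with chunk * window, not n_windows * window.
        out[start:stop] = np.nanpercentile(windows[start:stop], q, axis=1)
    return out
```

Usage is just `rolling_tail_percentile(returns, window=50)` per period; the `chunk` knob trades memory for fewer, larger batched calls.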