Conversation
Okay, I got curious for experimental results on this myself, so I've made a temporary, local amendment to notebook 4 to do the run. Plots of the results against everyone else to follow. Should take only an hour or two for just the one method.
|
A few hours later: Okay, scratch that runtime estimate. It's taking significantly longer, because there is now an extra hyperparameter, which means doing 3x as many Nelder-Mead runs, and there is a 5x slowdown from solving over the same data series locations in five separate little convex optimizations. I managed to take apart, de-scale (even the boiler), and reassemble the whole espresso machine, and this run is still not even halfway done. It's really an argument for not doing this kind of sliding window approach.
|
Looking at the results tables, the sliding TVR often finds exactly the same solution as the ordinary TVR, because they have the same RMSE. So I'm declaring this thing wholly redundant and merging this PR. I doubt anyone has been using it for anything, but if they are and some old code starts failing, we can direct them to use `jerk` instead.

This method got some attention in our discussion on #48 as possibly redundant, and as a dependency of `slide_function`, it makes #173 slightly harder to address. Thinking more about it, the original stated reason for having `jerk_sliding` (i.e. that the convex solver might struggle and be slow when there are too many points) isn't actually particularly addressed by this thing. For TVR, solve time is linear in the data sequence length $N$, so giving a longer sequence won't really hurt. The problem is still strongly convex, so CVXPY will still use OSQP and converge fast. Rather, breaking the problem up into sections that overlap and then smoothly combining sections of the results takes significantly more computation, because we run the convex solver over any given datapoint several times.
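
To make that concrete, here is a minimal sketch of a TVR-style fit in CVXPY. The name `tvr_smooth`, the `alpha` default, and the exact objective are my own illustration, not the library's `tvrdiff` implementation; the point is just that the whole-series problem is a single strongly convex QP that OSQP handles directly.

```python
import numpy as np
import cvxpy as cp

def tvr_smooth(x, dt, alpha=1e-2, order=3):
    """Sketch of a TVR fit: least-squares data term plus an l1 penalty on
    the order-th finite difference (order=3 penalizes jerk). Illustrative
    formulation only, not the library's exact objective."""
    u = cp.Variable(len(x))
    penalty = cp.norm1(cp.diff(u, k=order)) / dt**order
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - x) + alpha * penalty))
    prob.solve(solver=cp.OSQP)  # strongly convex QP, so OSQP converges fast
    return u.value

# Solve time grows roughly linearly with len(x), so a long series is fine.
dt = 10 / 500
x = np.sin(np.linspace(0, 10, 500)) + 0.01 * np.random.randn(500)
dxdt = np.gradient(tvr_smooth(x, dt), dt)  # derivative of the smoothed fit
```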

What will change in using `jerk` vs `jerk_sliding` is that you won't get that blending between solutions. The kernel ramps up for 1/5 of the window, is flat for 3/5, and ramps down for 1/5, and the stride is 1/5 of the window, so each window's ramp-down overlaps the next window's ramp-up and every interior point is covered by five windows.
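
A quick way to see what that kernel does (a sketch; `trapezoid_kernel` and the exact ramp endpoints are my guesses at the shape described above): strided copies of the trapezoid sum to a constant away from the series edges, so after normalization the overlap-add is a smooth partition of unity.

```python
import numpy as np

def trapezoid_kernel(w):
    """Ramp up for 1/5 of the window, flat for 3/5, ramp down for 1/5."""
    fifth = w // 5
    ramp = np.linspace(0.0, 1.0, fifth, endpoint=False)
    return np.concatenate([ramp, np.ones(3 * fifth), ramp[::-1]])

w, n = 100, 1000
stride = w // 5                  # stride of 1/5 window, as described
kern = trapezoid_kernel(w)

cover = np.zeros(n)              # overlap-add of the kernel weights alone
for start in range(0, n - w + 1, stride):
    cover[start:start + w] += kern

# Away from the edges the stacked weights are constant, so dividing a
# kernel-weighted overlap-add of per-window solutions by `cover` blends them.
print(np.ptp(cover[w:-w]))       # ~0 (up to float error): flat in the interior
```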
Does combining solutions smoothly like this provide any kind of benefit? I'm not sure, and it's possible, but my statistical intuition says "no", because this kind of local ensembling doesn't have access to any more information than the global algorithm. Especially given all the other methods' approximately-equal performance, I doubt these kinds of games could make the solution more accurate, but I didn't test it against everyone else in notebook 4, because this method is currently limited solely to `order=3` for jerk, which is often not the optimal choice. The method could be extended to offer different kernel choices and 1st, 2nd, and 3rd order, but I question the value of torturing this thing with these kinds of manipulations when the core algorithm, `tvrdiff`, is natively able to just handle the whole series. This is not quite like `polydiff` or `lineardiff`, where by necessity we have to break the problem up.
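
For reference, the whole sliding scheme amounts to roughly the following, stitched from the two sketches above (again my illustration, not the actual `jerk_sliding` code). It also makes the duplicated work visible: with a stride of 1/5 window, each interior point is solved for in five separate QPs.

```python
def tvr_smooth_sliding(x, dt, w=100, **kw):
    """Windowed TVR with trapezoidal blending (illustrative only)."""
    stride = w // 5
    kern = trapezoid_kernel(w)
    acc = np.zeros_like(x, dtype=float)
    weight = np.zeros_like(x, dtype=float)
    for start in range(0, len(x) - w + 1, stride):
        sol = tvr_smooth(x[start:start + w], dt, **kw)  # one small QP per window
        acc[start:start + w] += kern * sol              # kernel-weighted overlap-add
        weight[start:start + w] += kern
    return acc / np.maximum(weight, 1e-12)              # normalize the blend
```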