
New Optimizer code #125

Merged
pavelkomarov merged 33 commits into master from hack-the-optimizer-more on Jul 15, 2025

Conversation

@pavelkomarov (Collaborator) commented Jul 7, 2025

Big changes! Basically fully rewrote the optimizer code. Getting the same answers for spectraldiff; going to follow on with moves of some of the other modules, deleting now-unnecessary code as I go.

pavelkomarov changed the title from "big changes! Basically fully rewrote the optimizer code. Getting same…" to "New Optimizer code" on Jul 7, 2025
@pavelkomarov (Collaborator, Author) commented Jul 8, 2025

The Jupyter notebooks 2a and 2b should be moved over to use this syntax, maybe in this PR, maybe in a follow-on. Essentially, the shadow methods no longer exist; you instead call optimize with the original method.

Follow-up: done and folded into this (now giant) PR.
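For reference, the new calling pattern looks like the test code further down in this PR. A minimal sketch, with import paths assumed from the `__init__` changes in this diff and `polydiff` as an arbitrary example method:

```python
import numpy as np
from pynumdiff.linear_model import polydiff   # call the original method, not a shadow copy
from pynumdiff.optimize import optimize       # the single optimize() entry point

np.random.seed(0)
dt = 0.01
x = np.sin(np.linspace(0, 4, 400)) + 0.01*np.random.randn(400)

# No ground truth here, so the tvgamma regularizer is what gets used
# (value is illustrative; see the log_gamma heuristic in the tests below).
opt_params, opt_val = optimize(polydiff, x, dt, tvgamma=1e-2, dxdt_truth=None)

# Assuming the method accepts its tuned parameters as keywords, like savgoldiff above:
x_hat, dxdt_hat = polydiff(x, dt, **opt_params)
```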

# Savitzky-Golay filter #
#########################
def savgoldiff(x, dt, params=None, options=None, polynomial_order=None, window_size=None, smoothing_win=None):
def savgoldiff(x, dt, params=None, options=None, poly_order=None, window_size=None, smoothing_win=None):
@pavelkomarov (Collaborator, Author):

Decided to rename this parameter, because typing out the full word can take up a bunch of space and feel unnecessary.

if window_size < poly_order*3:
    window_size = poly_order*3+1
if window_size % 2 == 0:
    window_size += 1
@pavelkomarov (Collaborator, Author):

This thing was failing because the optimizer wanted to give even-length windows, so I've added back this +1.

"CVXPY to be installed. You can still pynumdiff.optimize for other functions.")

from . import finite_difference, smooth_finite_difference, linear_model, kalman_smooth
from ._optimize import optimize
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

Only need to import a single thing now: one function to rule them all.

None for k,v in search_space_types.items()]

# wrap the objective and scipy.optimize.minimize because the objective and options are always the same
_obj_fun = partial(_objective_function, func=func, x=x, dt=dt, singleton_params=singleton_params,
@pavelkomarov (Collaborator, Author):

I'm trying to keep down the amount of information flying around. The use of partial and the comment make clear that essentially a bunch of stuff that flows into the optimize function stays constant throughout the process, so we only need to focus on the point (and translation to and from the full parameter dictionaries).
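A toy version of the pattern, with a made-up objective standing in for `_objective_function`: everything constant gets frozen once with `functools.partial`, so the rest of the optimizer only ever passes a point around.

```python
from functools import partial
import numpy as np
import scipy.optimize

def objective(point, data, weight):            # stand-in for _objective_function
    return weight * np.sum((data - point)**2)  # the constant extras ride along via partial

data = np.array([1.0, 2.0, 3.0])
_obj = partial(objective, data=data, weight=0.5)                     # freeze the constants
_min = partial(scipy.optimize.minimize, _obj, method='Nelder-Mead')  # freeze the solver config

result = _min(np.zeros(3))   # only the starting point varies from call to call
print(result.x)              # ~ [1, 2, 3]
```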

padding=padding)
_minimize = partial(scipy.optimize.minimize, _obj_fun, method=opt_method, bounds=bounds, options={'maxiter':maxiter})

with Pool(initializer=filterwarnings, initargs=["ignore", '', UserWarning]) as pool: # The heavy lifting
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

Some of the subprocesses can raise warnings, because I print warnings throughout the code, for example when using a kernel with even width. There is no way to keep the optimization from querying functions with parameters that may throw warnings, so this keeps the procedure from making a ruckus.
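For anyone puzzling over the `initializer` trick: `Pool` calls `initializer(*initargs)` once in each worker process, so `warnings.filterwarnings("ignore", '', UserWarning)` runs inside the subprocesses, where the parent's warning filters wouldn't otherwise apply. A standalone demonstration with a made-up worker function:

```python
import warnings
from multiprocessing import Pool

def noisy_eval(n):
    if n % 2 == 0:   # e.g. a kernel with even width
        warnings.warn("even width, bumping by one", UserWarning)
    return n * n

if __name__ == '__main__':
    # Each worker starts by running filterwarnings("ignore", '', UserWarning),
    # so the UserWarnings raised inside noisy_eval never clutter the output.
    with Pool(initializer=warnings.filterwarnings,
              initargs=["ignore", '', UserWarning]) as pool:
        print(pool.map(noisy_eval, range(6)))   # [0, 1, 4, 9, 16, 25]
```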

# results are going to be floats, but that may not be allowed, so convert back to a dict
opt_params = {k:(v if search_space_types[k] == float else
                 int(np.round(v)) if search_space_types[k] == int else
                 v > 0.5) for k,v in zip(search_space_types, opt_point)}
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

This translation-back-to-dict code is awfully compressed. It may need to become a full-on encode/decode function pair if we extend this to handle strings. But I spent some time trying to figure out how to do that by adaptively mapping newly-seen strings to ordinal numbers, and it's actually mildly nontrivial to remember such a mapping. We could map all the strings in the codebase to arbitrary numbers, but then some are closer than others; we could map them to simplex vertices or something instead, but then you start adding several more dimensions to the search space. I'm not sure it's worth taking on that added complexity, honestly, given how few parameter settings are strings. Ideally we generalize all the way, especially because categorical differences also describe the case of comparing methods themselves. It may be more expeditious to take a different approach entirely, like Bayesian optimization with a library that handles this sort of thing a bit. We should think about it at a high level.
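If this compression ever does need to grow into an explicit pair, a sketch of what the decode half might look like for the currently supported types (purely illustrative; the parameter names in the usage comment are invented, and strings stay unsupported, per the discussion above):

```python
import numpy as np

def decode(point, types):
    """Map the optimizer's flat float vector back to typed parameters.
    `types` is an ordered mapping of parameter name -> float, int, or bool."""
    out = {}
    for (name, t), v in zip(types.items(), point):
        if t is float:
            out[name] = v
        elif t is int:
            out[name] = int(np.round(v))
        else:                      # bool: threshold the unit interval
            out[name] = v > 0.5
    return out

# decode([3.7, 0.2, 0.9], {'window_size': int, 'alpha': float, 'iterate': bool})
# -> {'window_size': 4, 'alpha': 0.2, 'iterate': True}
```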

from ..kalman_smooth import constant_velocity, constant_acceleration, constant_jerk


# Map from method -> (search_space, bounds_low_hi)
@pavelkomarov (Collaborator, Author):

All the wrapper classes basically just encoded the information in this big dictionary, but much more spread out. Here it's all in one place, and a massive amount of code can simply go away.
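To make the shape concrete, a hypothetical entry: the key is the differentiation function object itself, the values sketch a name-to-type search space and low/high bounds. The numbers and the exact value structure are invented for illustration, not the real defaults.

```python
from pynumdiff.linear_model import savgoldiff

# Hypothetical illustration of the method -> (search_space, bounds_low_hi) shape.
_METHOD_INFO = {
    savgoldiff: (
        {'poly_order': int, 'window_size': int, 'smoothing_win': int},              # search space: name -> type
        {'poly_order': (2, 8), 'window_size': (3, 999), 'smoothing_win': (3, 999)},  # low/high bounds per parameter
    ),
}
```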

from pynumdiff.optimize.kalman_smooth import constant_velocity, constant_acceleration, \
constant_jerk
from pynumdiff.utils.simulate import pi_control
from ..finite_difference import first_order as iterated_finite_difference # actually second order
@pavelkomarov (Collaborator, Author):

We now import the functions themselves, not identically-named shadows with different parameters. If we don't give the optimizer extra search space information, it can fish that information out of the defaults it already knows (the big dictionary above). Bounds are enforced.


# simulation
noise_type = 'normal'
noise_parameters = [0, 0.01]
@pavelkomarov (Collaborator, Author):

Some of these were defined but not used. I've passed them to the simulation function explicitly.

log_gamma = -1.6 * np.log(cutoff_frequency) - 0.71 * np.log(dt) - 5.1
tvgamma = np.exp(log_gamma)

def get_err_msg(actual_params, desired_params):
@pavelkomarov (Collaborator, Author):

This was only used in one test function, and it's not really necessary.

return err_msg


def test_first_order():
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

See #104 and #120. This thing isn't actually first order, so I renamed it a bit internally.

params1, val1 = optimize(meandiff, x, dt, search_space={'num_iterations':1}, tvgamma=tvgamma, dxdt_truth=dxdt_truth)
params2, val2 = optimize(meandiff, x, dt, search_space={'num_iterations':1}, tvgamma=0, dxdt_truth=None)
assert params1['window_size'] == 5
assert params2['window_size'] == 1
@pavelkomarov (Collaborator, Author):

I'm happy to report that in almost all cases the answers that come out here are identical, so for all my complicated changes to the optimizer, nothing broke.

return r.rvalue**2


def total_variation(x, padding=0):
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

This felt like it belonged here instead of in the utilities, since it's only used as part of calculating a metric in the absence of ground truth. Putting it here also lets it take a padding argument more naturally, which lets us skip calculating padding in the optimizer itself.

padding = int(0.025*len(x))
padding = max(padding, 1)

return np.sum(np.abs(x[1:]-x[:-1]))/(len(x)-1) # mostly equivalent to cvxpy.tv(x2-x1).value
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

"mostly". https://www.cvxpy.org/_modules/cvxpy/atoms/total_variation.html#tv. Our version just normalizes for length, and theirs doesn't.

params1, val1 = optimize(polydiff, x, dt, tvgamma=tvgamma, dxdt_truth=dxdt_truth)
params2, val2 = optimize(polydiff, x, dt, tvgamma=0, dxdt_truth=None)
assert (params1['poly_order'], params1['window_size']) == (6, 50)
assert (params2['poly_order'], params2['window_size']) == (4, 10)
@pavelkomarov (Collaborator, Author):

Here's an example where I got different answers, but these actually agree with the ones that used to be here before I touched the codebase. I had to change them to [2, 10] to get this to pass after my slide_function changes, I think. Why that changed things, I don't know. Why these changes recover what the optimizer used to be doing, I don't know. My changes have complemented each other in the right way, I guess.

removed a couple unnecessary imports, reran basic_tutorial notebook (RMS numbers worsened because no longer excluding the beginning and ends of the vectors in calculation), and changed the evaluation file because I realized padding=None makes it sound like no padding, not 2.5% padding, and padding wasn't getting used properly in the error_correlation and total_variation functions
"noise_type = 'normal'\n",
"noise_parameters = [0, 0.1] # mean and std\n",
"\n",
"# time step size and time series length in TIME\n",
"dt = 0.01\n",
"timeseries_length = 4"
"duration = 4"
@pavelkomarov (Collaborator, Author) Jul 8, 2025:

When I went to update the 2a and 2b notebooks, I realized I hadn't rerun this one, so now I have updated some things to agree with the updated evaluate and simulate files. Note that the default behavior is now to not pad when calculating the metrics, which means we're including the 2.5% at the edges, which tends to perform more poorly; thus the RMS numbers worsen. Without this change, they're exactly what they were before.

padding = max(padding, 1)
x = x[padding:len(x)-padding]

return np.linalg.norm(x[1:]-x[:-1], 1)/len(x) # normalized version of what cvxpy.tv does
@pavelkomarov (Collaborator, Author):

This used to be normalized by len(np.ravel(x)[0:-1]), which is $m-1$ rather than $m$. The paper says normalize by $m$ in the math, so I've made the code agree.
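In other words, for a length-$m$ signal the metric is now

$$\mathrm{TV}(x) \;=\; \frac{1}{m}\sum_{i=1}^{m-1}\lvert x_{i+1}-x_i\rvert,$$

which is cvxpy's `tv` atom (the bare sum) divided by $m$ rather than by $m-1$.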

…=0 while optimization notebooks use the optimizer's default, which was padding='auto'. Changed all default paddings to be 0 so the optimization optimizes the number shown on the plots by default in the 2a and 2b notebooks.
params_1, val_1 = first_order(x, dt, params=None, options={'iterate': True},
                              tvgamma=tvgamma, dxdt_truth=dxdt_truth)
params_2, val_2 = first_order(x, dt, params=None, options={'iterate': True},
                              tvgamma=0, dxdt_truth=None)
@pavelkomarov (Collaborator, Author) Jul 9, 2025:

I realized the way these were being called doesn't make any sense, because if dxdt_truth is given, tvgamma isn't used, and if dxdt_truth isn't given, tvgamma is used. So why was the carefully-calculated tvgamma above only passed in when dxdt_truth was, and 0 passed in when we would need to use tvgamma? Correcting this changes some of the solutions to the optimizations.
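Presumably the corrected pairing then looks something like this sketch (intent only, not the exact final test code, and the iterate/search_space handling is omitted): one call exercises the ground-truth path, the other exercises the tvgamma-regularized path.

```python
# Ground truth available: the error against dxdt_truth drives the search; tvgamma is irrelevant.
params_1, val_1 = optimize(first_order, x, dt, dxdt_truth=dxdt_truth)

# No ground truth: the carefully-computed tvgamma actually gets used.
params_2, val_2 = optimize(first_order, x, dt, tvgamma=tvgamma, dxdt_truth=None)
```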

pavelkomarov merged commit 79f3eb9 into master on Jul 15, 2025 (1 check passed)
pavelkomarov deleted the hack-the-optimizer-more branch on July 15, 2025 at 21:35