New system to partially replace xforms (maybe) #14

Open
jacobpennington opened this issue Jun 29, 2022 · 0 comments
jacobpennington commented Jun 29, 2022

The xforms system is useful in principle (it ensures that preprocessing and fitting procedures are reused consistently), but the implementation did not scale well, and many of the xforms functions had hard-coded associations with lab-specific usage.

Idea for a new system (inspired by scikit-learn pipelines):

A Pipeline class that performs a series of data transformations. The idea is that a user can:

  1. Add operations to a Pipeline instance
  2. Define a subclass of Pipeline with the operations already specified (similar to pre-built models idea)
  3. Call Pipeline.transform(data) to perform the steps.
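
The three usage modes above could be supported by something like the following. This is a minimal sketch, not a settled design: the `add_step`, `add_steps`, and `transform` names come from the proposal, but the internal step storage, the `(function, options)` tuple form accepted by `add_steps`, writing each result back to its `input` key, and the toy `double` and `add_offset` functions are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Optional


class Pipeline:
    """Hypothetical sketch: an ordered list of data transformations."""

    def __init__(self):
        self.steps = []

    def add_step(self, fn: Callable, input: Optional[str] = None,
                 kwargs: Optional[Dict[str, Any]] = None):
        # `input` shadows the builtin, but it is the keyword used in the proposal.
        self.steps.append((fn, input, dict(kwargs or {})))
        return self

    def add_steps(self, *steps):
        # Assumed format: each step is (function, keyword arguments for add_step).
        for fn, options in steps:
            self.add_step(fn, **options)
        return self

    def transform(self, data: Dict[str, Any]) -> Dict[str, Any]:
        data = dict(data)  # shallow copy so the caller's dict is not modified
        for fn, key, kwargs in self.steps:
            if key is None:
                # input=None: the step receives and returns the full data dict
                data = fn(data, **kwargs)
            else:
                # Otherwise the step transforms one entry, stored back in place
                data[key] = fn(data[key], **kwargs)
        return data


# Toy stand-ins for real preprocessing functions (hypothetical).
def double(values):
    return [2 * v for v in values]


def add_offset(data, offset=0):
    return {name: [v + offset for v in values] for name, values in data.items()}


pipe = Pipeline()
pipe.add_step(double, input='stimulus')
pipe.add_step(add_offset, kwargs={'offset': 1})
result = pipe.transform({'stimulus': [1, 2], 'response': [3]})
```

Returning `self` from `add_step` allows chained calls, and copying the dict in `transform` keeps the same pipeline safe to reuse on several datasets.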

Example 1:

# Preprocessing only
data = {'stimulus': ...<waveforms>... , 'response': ...<spikes>... , 'state': ...}
pipe = Pipeline()
# Add function objects that expect some kind of data as the first argument
pipe.add_step(sound_to_spectrogram, input='stimulus', kwargs={'n_channels': 18})
pipe.add_step(spikes_to_rates, input='response')
pipe.add_step(split_by_fraction, kwargs={'fraction': 0.9, 'axis': 0})
# Transform the data
new_data = pipe.transform(data)

Example 2:

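class MyStandardPipeline(Pipeline):
    def __init__(self):
        super().__init__()
        # Each step is (function, keyword arguments for add_step); exact format TBD
        self.add_steps(
            (sound_to_spectrogram, {'input': 'stimulus', 'kwargs': {'n_channels': 18}}),
            (spikes_to_rates, {'input': 'response'}),
            (split_by_fraction, {'kwargs': {'fraction': 0.9, 'axis': 0}}),
        )

data = {'stimulus': ...<waveforms>... , 'response': ...<spikes>... , 'state': ...}
# transform is an instance method, so instantiate the pipeline first
new_data = MyStandardPipeline().transform(data)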

I think it would be best to limit this to preprocessing for simplicity, but model fitting could also be included.

Example 3:

data = {'stimulus': ...<waveforms>... , 'response': ...<spikes>... , 'state': ...}
model = Model().add_layers(...)

pipe = Pipeline()
pipe.add_steps(...<preprocessing>...)
pipe.add_step(model.fit, input=None, kwargs={'target': 'response'})  # None: get full data dict instead of one value
pipe.add_steps(...<more processing>...)
pipe.add_step(model.fit, ...)
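
One open question with fitting steps is what they return: a `fit` call typically produces a fitted model rather than transformed data, so the steps after it would need the data passed through untouched. The sketch below shows one possible way to handle that, assuming a hypothetical `as_step` wrapper and a toy `MeanModel`; neither is part of the proposal, and a plain loop stands in for `Pipeline.transform`.

```python
class MeanModel:
    """Toy stand-in for a model: 'fitting' just stores the target's mean."""

    def __init__(self):
        self.mean = None

    def fit(self, data, target):
        values = data[target]
        self.mean = sum(values) / len(values)
        return self


def as_step(fit, **fit_kwargs):
    """Wrap a fit method so it receives the full data dict and passes it on unchanged."""
    def step(data):
        fit(data, **fit_kwargs)  # fits the model as a side effect
        return data
    return step


def scale_response(data):
    # Hypothetical preprocessing step operating on the full data dict
    return {**data, 'response': [r / 10 for r in data['response']]}


model = MeanModel()
steps = [scale_response, as_step(model.fit, target='response')]

data = {'stimulus': [0.1, 0.2], 'response': [10, 30]}
for step in steps:
    data = step(data)
```

With this arrangement the fitted state lives on `model`, and the data dict continues through any later processing steps.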

@jacobpennington jacobpennington added the enhancement New feature or request label Jun 29, 2022
@jacobpennington jacobpennington self-assigned this Jun 29, 2022