-
Notifications
You must be signed in to change notification settings - Fork 67
Rules with Mooncake #240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rules with Mooncake #240
Conversation
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
|
Should help with #237 |
lkdvos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, looks like things are coming together!
We can go over it in a bit more detail in the coming week, but I already have some global comments.
First, I think this is missing the allocator argument right now, which will typically be filled in by the macro calls. Since this shouldn't be differentiated that's not too hard to add, but there are some interesting questions surrounding how to handle how temporaries are dealt with.
For example, I know that for chain rules we effectively have to disable marking intermediates as temporary, which we don't need to do here, but there might be some complications if the allocator is a preallocated buffer for example?
A secondary question is just about code organizing, where all of the indexing logic is now duplicated in 3 places- chain rules, mooncake rules and TensorKit rules. There might be some argument for defining pullbacks that can be shared, similar to how matrixalgebrakit handles this. (However, as I'm typing this, I also don't think we should block this progress because of that)
|
Also, would you mind splitting of the minimal supported version increase in a separate PR? I think this is a good change in any case, but then I can also update the branch protection rules separately |
I agree with the conclusion here but (you're probably tired of hearing this) we should open an issue. I tried to move some of the logic into the |
|
|
@lkdvos is this then good to go? |
|
Can I take a quick look? Maybe a good way to refresh my Mooncake understanding. |
Please do! We have freedom of information on this PR :) |
|
Great PR, I still find Mooncake easier to read than Enzyme. |
|
Good te merge for me (if CI passes) |
|
Your PR no longer requires formatting changes. Thank you for your contribution! |
|
One final comment and a question. Otherwise this is ready I think. |
lkdvos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks really good, I only have one final annoying thing that we should probably discuss how to deal with.
Bringing up the allocator again (I know, apologies), the thing I'm slightly worried about is introducing silent mistakes in our generic rules because we might be intercepting too many signatures.
For example, in the Bumper extension we are simply passing a buffer as allocator argument, which will be modified, but the contributions are completely discarded.
In principle, if you then use the state of that buffer, this would now give wrong results (although it might be fair that you shouldn't ever do this but oh well).
This might not even be an issue, since it might be reasonable to claim that a cost function should never depend on the state of some caches anyways, but these are quite hard to catch silent bugs, so perhaps we should extend the docs section and make note of this?
Otherwise happy to merge this!
I agree that we should document and possibly revisit this (maybe not using the Varargs?) -- do you think it's a blocker to merge this or should we add the doc note now so we don't forget? |
|
I think it might be nice to just add a cautionary note in the docs for this. I'm definitely happy with not blocking this until we have a grasp of whether or not this is important, but I also really want to have this written down somewhere to avoid someone else using this and getting unexpected results without us having somewhere to point to like "you've been warned" 😉 |
|
Since you're more familiar with the TO docs, do you have a suggested place to put the warning? Feel free to just push a commit if you want. |
lkdvos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spotted a typo in my own previous text, otherwise good to go from my end!
|
We may need to bypass merge here also because the GPU runners are having a very sad day |
Reimplements many of the rules (I couldn't get the automated importer to work well) using Mooncake. Tests currently do not cover complex elements as I need to make a quick PR to StridedViews first, once that is done, I will enable the tests here.