-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core] Adding debug check in CheckAndCorrectZeroDiagonalValues
#12782
[Core] Adding debug check in CheckAndCorrectZeroDiagonalValues
#12782
Conversation
kratos/spaces/ublas_space.h
Outdated
} | ||
} | ||
KRATOS_ERROR << "Diagonal term (" << Index << ", " << Index << ") is not defined in the system matrix" << std::endl; | ||
#else | ||
rA(Index, Index) = scale_factor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isnt this still some find
operation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what alternative there is?. Only executed in debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, i would use the following (surely to be slightly adapted to make it to compile)
auto pos = std::lower_bound(&Acolndices[col_begin],&Acolndices[col_end]);
if pos != &Acolndices[col_end] //it was found
rValues[pos - &Acolndices[col_begin]] = scale_factor;
#ifdef _DEBUG
else
KRATOS_ERROR ...
#endif
this code shall be a bit faster than what you have, and it will use a binary search
if you do not like this, there is a function "find_element(i,i)" which will do the same as calling the lower bound.
i think my version should be slightly faster because we already have col_begin and col_end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anycase looks like we have tests failing in full debug, hurray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you take a look to what are the cases failing? to know if it is a realistic scenario or just something we use in the tests (my bet is that there are constraints ... )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your code:
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolver
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithZeroContribution
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithConstraints
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithInactiveConstraints
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithConstraintsAuxiliarNode
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithConstraintsAuxiliarNodeInverted
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverAllDoFsMaster
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverAllDoFsMasterFromStructureSide
[ FAILED ] KratosCoreFastSuite.BasicDisplacementEliminationBuilderAndSolverWithZeroContribution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my code was wrong. it should be
pos - &Acolndices[0]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, still failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, still failing
Wait it was my fault, testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay now it only fails in Debug (in my machine) and the failing tests are the same:
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolver
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithZeroContribution
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithConstraints
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverWithInactiveConstraints
[ FAILED ] KratosCoreFastSuite.BasicDisplacementBlockBuilderAndSolverAllDoFsMaster
[ FAILED ] KratosCoreFastSuite.BasicDisplacementEliminationBuilderAndSolverWithZeroContribution
well, i am afraid we will need to debug then ... really calling A(i,i) would have an enormous performance impact (apart for threading issues)... so i really hope we are not doing it in real life |
Looks like the iterators were giving issues, should work now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Letting the user know what went wrong is the absolute minimum.
Ideally, an exception would be thrown even in release mode.
Co-authored-by: Máté Kelemen <[email protected]>
Co-authored-by: Máté Kelemen <[email protected]>
const auto it_Acol_indices_row_end = it_Acol_indices_begin + col_end; | ||
|
||
const auto lower = std::lower_bound(it_Acol_indices_row_begin, it_Acol_indices_row_end, Index); | ||
const auto upper = std::upper_bound(it_Acol_indices_row_begin, it_Acol_indices_row_end, Index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto upper = std::upper_bound(it_Acol_indices_row_begin, it_Acol_indices_row_end, Index); | |
const auto upper = std::upper_bound(lower, it_Acol_indices_row_end, Index); |
sry for the pickiness, but if we're doing it, we may as well do it properly.
No need to start searching from the range begin again. lower_bound <= upper_bound always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think you need the upper (in line with the suggestion of @matekelemen):
if(empty) {
const auto it_Acol_indices_row_begin = it_Acol_indices_begin + col_begin;
const auto it_Acol_indices_row_end = it_Acol_indices_begin + col_end;
const auto lower = std::lower_bound(it_Acol_indices_row_begin, it_Acol_indices_row_end, Index);
if (lower != it_Acol_indices_row_end && *lower==j) { // Index was found
r_Avalues[std::distance(it_Acol_indices_begin, lower)] = scale_factor;
} else {
#ifdef KRATOS_DEBUG
KRATOS_ERROR << "Diagonal term (" << Index << ", " << Index << ") is not defined in the system matrix" << std::endl;
#else
per regards the error i am not really sure.
This s a scaling operation. If the row is not there than i would say it makes no sense to scale it ... not that this is an error ...
What is actually wrong in my opinion is to put a nonzero diagonal where there was no diagonal (in a sense equivalent to having a zero diagonal)
ouch, we cannot use both lower_bound and upper_bound, the cost is the double ... at this point we could use rA.find_element(i,j) |
it's two O(log(n)) operations. A linear scan is ... linear, so think we can still go with this. |
alternatively, we could skip the upper_bound and check the value of what lower_bound returns. If it's the index we're looking for, the main diagonal entry is stored, otherwise it isn't. just keep weird cases in mind like completely empty rows. |
That's what I was doing |
Co-authored-by: Philipp Bucher <[email protected]>
@loumalouomega in release mode, does it run? what i am asking if without throwing the error it does run. If it does, than it is an indication that my suggestion is correct: do nothing does not matter if the diagonal is not there |
Yes, in release yours worked |
I know this PR is still under development, but just a heads up that in our pipelines (which run in release) we get a couple of tests failing, all with messages like:
So we actually end up in the situation where we have a completely zero column after these changes. This was run with the last commit being @matekelemen suggestion, I just merged the latest changes to see if there is a difference (but I only see some simplifications, so it's probably the same situation) Update: situation is unchanged after running with the latest changes |
Now the tests failing are the ones froim Geo app |
Indeed, for the 'Custom' builds the errors are the same as we find in our release builds, while the debug ones run into the newly added error: |
yep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I guess what we need to decide then is whether we allow matrices with completely empty rows or not.
Personally, I don't think it would make any sense to allow them. If I recall correctly, matrices aren't manipulated after this point and are passed to linear solvers, who'll obviously fail in such cases.
Maybe the @KratosMultiphysics/geomechanics team can shed some light on how they're getting matrices with empty rows, and whether they are built with empty rows on purpose.
Probably the right approach is to add the diagonal item in the ConstructMatrixStructure |
But that implies to modify the B&S. BTW, a priori if using the Block B&S the diagonal is guaranteed, if not in that case I will need to take a look. |
What is the status of this, what should we do? |
I had some discussions within the team and from the geomechanics side we don't directly expect to have empty rows in our matrix. We can have a look (Friday at the earliest) using this branch to see what's going on in the few specific failing test cases. |
Hi @loumalouomega, we found some time and had a quick look at the failing test cases: the repeating pattern we see in the failing tests is our 'excavation process'. In short what it does is (de)activate elements/conditions in a certain model part (for more details see Kratos/applications/GeoMechanicsApplication/custom_processes/apply_excavation_process.cpp Lines 32 to 46 in 8776deb
When I remove this process from the test, it completes without the error (but of course the results are different, since we don't use the deactivation functionality). Does the fix in this PR take into account what happens when certain elements are inactive? Could that for example result in certain diagonals not existing in the A matrix when calling this function? FYI @mnabideltares |
If you deactivate elements and the connectivity changes maybe you should update the dofs with the flag |
I forgot what is the status of this |
I think that people from @KratosMultiphysics/geomechanics needed to update tests? |
@loumalouomega We are currently looking into the failing tests. The issue seems to be that in our workflow DoF are added for nodes which are only part of an inactive element. Their matrix entries remain uninitialized. We are looking into ways to fix this at our end. Ideally by not adding the DoF for inactive nodes in the first place (if this is possible depends a bit on the order of operations) and in the worst case add a diagonal value to the matrix to unblock this PR. |
Okidoki |
This is strange. Dovrete is constructed using all elements (not just the active ones) exactly to avoid this provlem. Same for construct matrix structure |
Wait, so If it's a core feature and the system ignores it, then it must be fixed in core. |
In the core (block builder) the zeros of all of the inactive elements are
added to the matrix and to the dofset. this is not as bug it is a feature.
If that was not so you would need to reform the dofset and the nonzeros
every time you activate/deactivate an element...and that is typically more
expensive than solving a larger system
El lun, 18 nov 2024, 21:09, Máté Kelemen ***@***.***>
escribió:
… @loumalouomega <https://github.com/loumalouomega> We are currently
looking into the failing tests. The issue seems to be that in our workflow
DoF are added for nodes which are only part of an *inactive* element.
Their matrix entries remain uninitialized. We are looking into ways to fix
this at our end. Ideally by not adding the DoF for inactive nodes in the
first place (if this is possible depends a bit on the order of operations)
and in the worst case add a diagonal value to the matrix to unblock this PR.
Wait, so Dofs from inactive Nodes get added to the linear system anyway?
Is the activation system unique to the geo app, or is it a core feature?
If it's a core feature and the system ignores it, then it must be fixed in
core.
—
Reply to this email directly, view it on GitHub
<#12782 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB5PWEM6SZGABVKA4UMLIET2BHRJHAVCNFSM6AAAAABQQ5P576VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIOBTGAYDEMJUGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Got it @RiccardoRossi, thanks. |
We are still looking into the failing tests, it seems that the issue is related to our time integration schemes performing some operations only on active elements (including GetDofList and EquationId, which seems to be incorrect considering the block builder and solver adds zeros for inactive elements). We'll do some further testing if that is really the only remaining issue and try to provide a fix soon, such that this PR can be merged. |
Hi @loumalouomega, yesterday we merged a change on our end fixes the failing test cases in geo, at least locally (see #12864 for more details). Could you bring this branch up to date with master to see if it is also fixed in the github pipelines? |
📝 Description
Initial work for #12768.
Basically checks in debug that the daigonal terms exist instead of directly asign them. This will not actually fix the issue in not initialzed diagonals, but this is a first step.
🆕 Changelog
CheckAndCorrectZeroDiagonalValues