-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new implementation of overlap management #14
Comments
Hi @ClementMassonnaud , I'm currently checking your new implementation of the overlapping admission algorithm. (I am using a subset of a real dataset, so no simulated data at this moment). I've run into a couple of issues that might need our attention:
whereas the new implementation times at:
I think this is mainly die to the "per subject" calculation of overlaps, although I haven't checked this. However, I used to use the by=sID method before switching to the full list method. @PascalCrepey taught me this, and I guess you know it too. It's quite fast... basically you calculate all differences between record N and N+1 and exclude any differences where sID in N is not the sID in N+1. Isn't that possible here as well?
Because overlap detection is based on successive records, sorted on patient, admission, and discharge, the algorithm will only detect and correct the first instance, leaving two overlaps: The next run, the following overlap is detected and corrected: Etc. This is the original method uses the iterations, and I think this could be done in the new method as well.
We might need to think of a way to also use a real dataset (or subset thereof) to validate the methods.
|
My "beautiful" schematic of overlapping admission didn't align there... But I think you get the idea. |
Hi,
I tried to think about the different types of possible overlaps, but I missed this type. I now see what is the issue... I usually try to avoid 'while' loops as much as possible, but here it seems we don't have much choice indeed
Yes, I think it is a key point, to make sure the function deals correctly with every possible scenario. I wrote tests in
I'm not sure I understand what you mean by auxiliary data. If by that you mean additional columns in the database, I don't understand where is the issue
Yes this is interesting, I missed that. I don't see why it would not be possible here as well. I am currently on the road, I will look into it ASAP. |
It seems to me we need to remain on @tjibbed original implementation, with the loop. Am I right ? does it need further optimization ? Can we merge @ClementMassonnaud data.table optimization into @tjibbed implementation ? or should we simply close this issue ? |
Yes we do need the loop I think |
As @ClementMassonnaud said, we need to look at the new version of the adjust_overlapping_stays() function proposed with commit #50200db
"[...] the two versions give the same results on all the tests I wrote. Considering the current version though, I'm not sure to understand the purpose of the iterator and how it works. Because I created a fake database with all the types of overlapping stays I could think of, with overlapping stays nested multiple times, etc. And it seems that the function adjusts for those stays correctly with only one pass in the while loop... Maybe you can take a look at the tests and see if I'm missing something.
Anyway, the new function works quite similarly to the current one, but with a slight change that allows to adjust partial overlaps with either admission or discharge leading. Also, it's using a bit more of the useful data.table syntax which makes the function shorter.
If the new function is not missing anything important that I might have missed, I would propose to use it as a default, mainly for the two advantages I mentioned."
The text was updated successfully, but these errors were encountered: