Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add startTransfer() and transferCounter to DeviceModule #131

Closed
9 tasks
killenb opened this issue Apr 3, 2020 · 4 comments
Closed
9 tasks

Add startTransfer() and transferCounter to DeviceModule #131

killenb opened this issue Apr 3, 2020 · 4 comments

Comments

@killenb
Copy link
Member

killenb commented Apr 3, 2020

Child of #123
Implement after #129

The exception handling scheme requires a mechanism that prevents the ExceptionHandlingDecorators to start a transfer if the device is in error state. Especially after opening the device, where the accessors are already functional, no access must happen before the initialisation sequence is done and the recovery accessors have been written.

This ticket provides the interface in the DeviceModule to be able to implement this in the ExceptionHandlingDecorator.

Scheme

  • To synchronise, the variable deviceHasError is used.
    • It is set to true in reportException, which is called by the ExceptionHandlingDecorator in its thread (already there).
    • It is set false by the DeviceModule after successful recovery (already there).
  • There is a function waitForRecovery() which allows to wait for the flag to change.
  • For efficient waiting there a two condition variables errorIsResolvedCondVar and errorIsReportedCondVar, and a common mutex errorMutex. The common predicate for the condition variables is deviceHasError.
  • A counter in the DeviceModule is tracking how many transfers are running. Only when all transfers have finished, the recovery can take place.
  • A function startTransfer() is increasing the counter if the device is not in error state. As
    the deviceHasError must only be accesses while holding the mutex, also increasing the counter must happen under the mutex to ensure without race condition that no transfer is started while the device is not OK.
  • The counter has to be decreased without acquiring the mutex. That's why it has to be atomic.

Implementation

  • Add a protected atomic integer "transferCounter" to the DeviceModule
  • Limit the code where the errorMutex is locked. It only must happen
    • when the deviceHasError variable is accessed
    • when the error reporting queue is accessed
    • when the transfer counter is increased.
  • Check that the deviceHasError variable is set true and false in the correct places (spec 2.1, 2.3.6 and (FIXME 2.6.1 is too late. It has to happen in 2.5.1)
  • Implement a function bool startTransfer() as described above.
    • It does not block (except for the short time while its waiting for the lock).
    • It checks that the application is in LifeCycleState::run and returns false if not. It does not try to acquire the lock or increase the counter in this case. (see ExceptionHandlingDecorator doPreRead() and doPreWrite())
    • It gets the error lock and checks deviceHasError
      • if the device is OK it increases the counter while holding the lock and then returns true
      • It returns false if the device is not ok. The counter is not increased.
  • Add a convenience function void stopTransfer().
    • It has an assertion that the counter is larger than 0 to simplify debugging
    • It just decreases the counter without lock.
  • The DeviceModule loop waits until counter goes to 0 after having the the exception, but before starting the recovery.
  • Change the ExceptionHandlingDecorator
    • preWrite() has to uses the return value of startTransfer() to set the transferAllowed variable
    • in postWrite() call stopTransfer if the transfer was allowed (must not be called if there was no transfer
    • Change the place where waitForRecovery() is called. This currently is inside the catch-block which can only be reached if a transfer was allowes. In the current implementation this is OK because the transfer is always allowed if the application state is Running. It has to be moved out of the transferAllowed check and always be executed if the appication state is Running.
    • The implementation for reading is more complicated because of the recovery which is going on in postRead(). This is done in Proper implementation of RecoveryAccessor read operations #138.

Definition of done

@killenb
Copy link
Member Author

killenb commented Apr 14, 2020

SharedLock does not work. I would have to set the unique lock which is held inside the DeviceModuleThread when reporting the exception from another thread. The only solution is a condition variable, which is already there: deviceHasError

@killenb killenb changed the title Add getSharedTransferLock and transferCounter to DeviceModule Add startTransfer() and transferCounter to DeviceModule Apr 14, 2020
@killenb
Copy link
Member Author

killenb commented Apr 14, 2020

SharedLock does not work. I would have to set the unique lock which is held inside the DeviceModuleThread when reporting the exception from another thread. The only solution is a condition variable, which is already there: deviceHasError

I updated the ticket description accordingly. Note that the specification is not 100 % up to date. It still mentions a shared lock.

@killenb
Copy link
Member Author

killenb commented Apr 15, 2020

Notice: The tests will only succeed if #130 is implemented. This ticket does otherwise not depend on it, so it can already be started.

@killenb
Copy link
Member Author

killenb commented Jul 30, 2020

The functions startTransfer() and stopTransfer(). The concept has been put into the specification and has been implemented.

@killenb killenb closed this as completed Jul 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants