Skip to content
This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Ursache: 2000 - etwas ist schiefgelaufen. #1214

Closed
kereng5 opened this issue Sep 23, 2020 · 16 comments
Closed

Ursache: 2000 - etwas ist schiefgelaufen. #1214

kereng5 opened this issue Sep 23, 2020 · 16 comments
Assignees
Labels
bug Something isn't working mirrored-to-jira This item is also tracked internally in JIRA

Comments

@kereng5
Copy link

kereng5 commented Sep 23, 2020

Avoid duplicates

I have searched for "2000" and found no matching issue.

Describe the bug

After opening CWA:

ursache2000

Sorry, I did not hit "DETAILS" because I supposed "Ursache: 2000" would already be well known.
(Note the spelling error "referto")

Expected behaviour

CWA should perform the check and show the green screen below.

Steps to reproduce the issue

  1. I Started the phone still plugged into the charger.
  2. Opened ENF first to see the number of lines in the log before the check, as I do every day.
  3. Opened CWA, saw "Ursache: 2000" and took the photo above.
  4. After OKAY I saw this, though there had been a successful check yesterday:
    nach_ok
  5. Had a look at App-Informationen to see if the version is still 1.3.0
  6. After returning:
    heute_8_37

Technical details

  • Mobile device: Motorola G3
  • Android version: 6.0.1
  • CWA version 1.3.0
  • ENF 16203315000
  • no "Priorisierte Hintergrundaktivität"

Additional context

The ENF log shows 14 lines all with 08:37.
So there was just the error message but no enduring problem.


Internal Tracking ID: EXPOSUREAPP-2871

@kereng5 kereng5 added the bug Something isn't working label Sep 23, 2020
@MikeMcC399
Copy link
Contributor

The string "something went wrong during a critical part of the application ensuring security, please refer" +
"to the details for more information" comes from https://github.com/corona-warn-app/cwa-app-android/blob/master/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/exception/CwaSecurityException.kt . I don't know why this string is embedded in the code instead of being in https://github.com/corona-warn-app/cwa-app-android/blob/master/Corona-Warn-App/src/main/res/values/strings.xml etc. where it gets translated. Looking at the code it's also clear why there is no space between "refer" and "to".

The actual error 2000 seems to come from https://github.com/corona-warn-app/cwa-app-android/blob/master/Corona-Warn-App/src/main/java/de/rki/coronawarnapp/util/security/SecurityHelper.kt .

@vaubaehn
Copy link
Contributor

@d4rken FYI: see comments above. Related to #642 (comment) Tink? - Keystore busy?

@d4rken
Copy link
Member

d4rken commented Sep 23, 2020

Likely the same cause as #642

@heinezen
Copy link
Member

Hey @kereng5,

Thanks for your report. The issue has been forwarded to the developers and is tracked as Jira ticket EXPOSUREAPP-2871.

Regards,
CH


Corona-Warn-App Open Source Team

@d4rken
Copy link
Member

d4rken commented Oct 1, 2020

Revisiting this again, @kereng5 if you could keep using the app after seeing this error, then it is likely not #642, but more likely that the downloaded app config could not be verified. Not sure how it could happen though, download corrupted somehow?

@vaubaehn
Copy link
Contributor

vaubaehn commented Oct 1, 2020

@d4rken looking to the screenshots above:
Pic1 looks like SharedPreferences couldn't be read correctly, but somehow a valid (but wrong) value was used to create risk status 'not active long enough'. [if there was no issue with ESP, there shouldn't be 'unknown risk status'].
In Pic2 the correct duration of active exposure logging (14 days) could be read out of ESP.
Could there also be temporary/recoverable keystore problems?

@d4rken
Copy link
Member

d4rken commented Oct 2, 2020

I think the SharedPreferences can either be read completely or not at all as Tink would have overwritten our key 🤔. Once the preferences are loaded, it should not be possible to experience the issue without app restart. The tell tale sign I would expect to see is onboarding being shown again, at least if there was an app restart.

@vaubaehn
Copy link
Contributor

vaubaehn commented Oct 2, 2020

Hi @d4rken ,
my understanding of the whole procedure is this (when there is no error):

CWA wants to read a special value from a key/value-pair in ESP -> API call androidx.security -> Tink calls AndroidKeyStore for de-/encryption key -> SharedPreferences are opened -> all keys (of the key/value-pairs) in SharedPreferences are decrypted -> key found: value is decrypted -> value is returned to CWA in the end.

CWA wants to write a special value to a key/value-pair in ESP -> API call androidx.security -> Tink calls AndroidKeyStore for de-/encryption key -> SharedPreferences are opened -> all keys (of the key/value-pairs) in SharedPreferences are decrypted -> key found: value is stored encrpted - OR: key not found: key/value-pair is stored encrypted -> SharedPreferences are saved and closed.

Is this assumption plausible?

According to this comment tink-crypto/tink#321 (comment) , there are two options in case of error in the currently used version: either a new key/value-pair is created, or the master key (from AndroidKeyStore) for de-/encryption is changed.

EncryptedSharedPreferences then would only be not accessible at all, if the master key was changed. I guess, that is the reason why CWA crashes in the beginning.

EncryptedSharedPreferences can be accessed, if the master key is still usable, but a single read from key/value-pair may fail. I think, this is what originally caused 9002: file is not a database, when key/value-pair for the db-password could not be read.
If ESP can still be accessed, but the onboarding-key/value-pair cannot be read, then onbording starts again, what we've also seen before, which then afterwards already can lead to a 9002: file is not a database.

So, in above case, it might have happened, that master key was still intact, but key/value-pair was not readable and it failed with cause 2000, but a new key/value-pair might have been created. Interesting question: if that new pair is created, but not overwritten, could it be possible, that the original pair can be read again, after the AndroidKeyStore becomes accessible again and decryption of the original values works fine? To examine this, the SharedPreferences.xml would need to be examined. If it contains more entries than values are needed for CWA, then this could have happened.

By the way: I think that the KeyStore busy problem does not only occur when several apps are accessing the KeyStore in the same time, but also when a single app uses the KeyStore frequently.
In this context, the GoogleQuoataCalculator v1 made heavily use of the ESP (and KeyStore), as it wrote several times to the SharedPrefs when counting the Api-Calls...
Another issue in this context: I'm not sure how reliable values can be written to SharedPrefs, when they're used like runtime variables (like counters), and they're saved with editor.apply. Even Google states editor.apply is best practice, I would assume that there might be inconsistencies in the saved values, when it's up to the OS to store the values into SharedPrefs when it thinks it's a good time to do... are we sure, that values are written in exactly the sequence we've changed them? So, I would defintively make use of editor.commit in this case. Or even better: read all key/value-pairs on OnCreate(), put them to global variables, change global variables on runtime, and save them back to SharedPrefs all at once, when the context is save, in other words: using the SharedPrefs like a store for preferences, and not (ab)use it, like they are variables in RAM.

Does it make sense?

@d4rken
Copy link
Member

d4rken commented Oct 2, 2020

I can't answer all of that reliably as I currently don't know enough.

We only create the EncryptedSharedPreferences object once, then re-use it. The code (at least what I understood), only uses Tink once to create the necessary instances based on the data in the KeyStore.

These comments seem to support that 1 and 2.

So I currently don't see how it could fail to only access part of the keys. You are right though that the initial 9002 error behavior without new onboarding somehow doesn't fit.

It's really a difficult issue to debug ☹️ .

I think that the KeyStore busy problem does not only occur when several apps are accessing the KeyStore in the same time, but also when a single app uses the KeyStore frequently.
In this context, the GoogleQuoataCalculator v1 made heavily use of the ESP (and KeyStore), as it wrote several times to the SharedPrefs when counting the Api-Calls...

I don't think it reaccesses the systems keystore for that again after it has already done so.

Another issue in this context: I'm not sure how reliable values can be written to SharedPrefs, when they're used like runtime variables (like counters), and they're saved with editor.apply. Even Google states editor.apply is best practice, I would assume that there might be inconsistencies in the saved values, when it's up to the OS to store the values into SharedPrefs when it thinks it's a good time to do... are we sure, that values are written in exactly the sequence we've changed them?

I have thought about that too, especially with our app woes of being killed in the background by battery optimizations.

Maybe it's a combination of apply and the busy keystore issue...

So, I would defintively make use of editor.commit in this case. Or even better: read all key/value-pairs on OnCreate(), put them to global variables, change global variables on runtime, and save them back to SharedPrefs all at once, when the context is save, in other words: using the SharedPrefs like a store for preferences, and not (ab)use it, like they are variables in RAM.

There will be an overhaul of our LocalData class and I'll consider changing everything to commit instead of apply at least for critical operations like storing a database password. Though I think removing the EncryptedSharedPreferences is also up for discussion. I don't think the additional headaches it is causing us is worth the extra layer of security, on top of the systems on-disk encryption, especially for mundane data such as "isOnboardingCompleted".

@vaubaehn
Copy link
Contributor

vaubaehn commented Oct 2, 2020

@d4rken
thanks for the nice discussion.

We only create the EncryptedSharedPreferences object once, then re-use it.

Ok, got it. Thx for the link.
Yes, looked into the comments, I would understand it like you. (one comment in that thread looked like, there is rather a similar project like ours here ;) )

I don't think it reaccesses the systems keystore for that again after it has already done so.

Makes sense in this context.

Maybe it's a combination of apply and the busy keystore issue...

Probably...

There will be an overhaul of our LocalData class and I'll consider changing everything to commit instead of apply at least for critical operations like storing a database password.

This sounds really good.
Also I read that you consider to implement ForegroundService for the retrieval of diagnosis keys (and also providing to ENF and waiting for the matching result?). I assume that this will relax this situation significantly due to a stable background task. All the time critical operations then have lots of time ;) (assuming that aggressive battery optimizations won't/can't kill foreground services). The only critcal component will then wait in front of the screen.

Though I think removing the EncryptedSharedPreferences is also up for discussion. I don't think the additional headaches it is causing us is worth the extra layer of security, on top of the systems on-disk encryption, especially for mundane data such as "isOnboardingCompleted".

This is probably the best solution. Some people call me obsessive about data privacy, but I can't find any critical information to protect additionally in a second layer at the moment.
Most sensible information is 'increased risk status' that can be fetched on runtime from ENF, and doesn't need to be stored.
The next is 'being infected' and uploading keys, what is protected by plausible deniability, but nothing is stored locally about it, afair.
Having registered a test (having stored a token) is completely uncritical, as thousands have, and it does not give much information about health state.

Ah... Only critical information would be a positive test result, but I don't know at the moment, if the result is stored/cached locally? Or is it always fetched from server on runtime?
And everything else is absolutely unimportant imho...

Anyway, thanks again, and I hope you find some time to relax this week-end.

@heinezen
Copy link
Member

heinezen commented Jan 5, 2021

Dear @kereng5 and community ,

Did the error happen to you again in the latest releases of the Corona Warn App?

Regards,
CH


Corona-Warn-App Open Source Team

@MikeMcC399
Copy link
Contributor

@heinezen
A secondary issue brought up by @kereng5 in the original post was the formatting of the error message with the missing space between the words "refer" and "to". That cosmetic issue has not been addressed.

In CwaSecurityException.kt#L8-L9

    "something went wrong during a critical part of the application ensuring security, please refer" +
            "to the details for more information",

could be reformatted to:

    "Something went wrong during a critical part of the application ensuring security, please refer " +
            "to the details for more information.",

for better readability if the error does reoccur.
(Capitalise "s" in "something", add space after "refer" and add full stop "." at the end of the error message.)

@kereng5
Copy link
Author

kereng5 commented Jan 5, 2021

@heinezen
It did not happen again in version 1.5 or later.

@heinezen
Copy link
Member

heinezen commented Feb 9, 2021

Closing this because we received positive feedback. @MikeMcC399 's remark about the error message was also fixed.


Corona-Warn-App Open Source Team

@heinezen heinezen closed this as completed Feb 9, 2021
@isaMol12
Copy link

Helly, I have the same issue "Ursache 2001" I already checked the certificates of safety but its all set...

@MikeMcC399
Copy link
Contributor

@isaMol12

I have the same issue "Ursache 2001" I already checked the certificates of safety but its all set...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working mirrored-to-jira This item is also tracked internally in JIRA
Projects
None yet
Development

No branches or pull requests

7 participants