Add data-classification.md extension #1317
Merged
duglin merged 9 commits into cloudevents:main from rob-sessink:feature/data-classification.md on Dec 12, 2024
f102308 Add data-classification.md extension
5d89578 FIX based upon PR comments: correct spelling, add link in extensions/…
db7fcda FIX based upon PR comments: improve spelling
b22870d FIX based upon PR comments: improve description around recommended la…
7d3f27b FIX based upon PR comments: improve wording and usage of notational c…
de8f7a5 FIX: add missing 'of'
a1b3ae7 FIX based upon PR comments: extend usage section to state expectation…
c4c2ca1 FIX: must -> MUST
c280ffd FIX based upon PR comments: in Usage section change 'ignore event' in…
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# Data Classification Extension | ||
|
||
CloudEvents might contain payloads which are subject to data protection | ||
regulations like GDPR or HIPAA. Knowing how event payloads are classified, | ||
which data protection regulations apply, and how payloads are categorized | ||
enables intermediaries and consumers to process events compliantly. | ||
|
||
This extension defines attributes to describe to | ||
[consumers](../spec.md#consumer) or [intermediaries](../spec.md#intermediary) | ||
how an event and its payload are classified, the category of the payload, and any | ||
applicable data protection regulations. | ||
|
||
These attributes are intended for classification at an event and payload level | ||
and not at a `data` field level. Classification at a field level is best defined | ||
in the schema specified via the `dataschema` attribute. | ||
|
||
## Notational Conventions | ||
|
||
As with the main [CloudEvents specification](../spec.md), the key words "MUST", | ||
"MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", | ||
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as | ||
described in [RFC 2119](https://tools.ietf.org/html/rfc2119). | ||
|
||
However, the scope of these key words is limited to when this extension is used. | ||
For example, an attribute being marked as "REQUIRED" does not mean it needs to | ||
be in all CloudEvents, rather it needs to be included only when this extension | ||
is being used. | ||
|
||
## Attributes | ||
|
||
### dataclassification | ||
|
||
- Type: `String` | ||
- Description: Data classification level for the event payload within the | ||
context of a `dataregulation`. In situations where `dataregulation` is | ||
undefined or the data protection regulation does not define any labels, then | ||
RECOMMENDED labels are: `public`, `internal`, `confidential`, or | ||
`restricted`. | ||
- Constraints: | ||
- REQUIRED | ||
|
||
### dataregulation | ||
|
||
- Type: `String` | ||
- Description: A comma-delimited list of applicable data protection regulations. | ||
For example: `GDPR`, `HIPAA`, `PCI-DSS`, `ISO-27001`, `NIST-800-53`, `CCPA`. | ||
- Constraints: | ||
- OPTIONAL | ||
- if present, MUST be a non-empty string without internal spaces. Leading and | ||
trailing spaces around each entry MUST be ignored. | ||
|
||
### datacategory | ||
|
||
- Type: `String` | ||
- Description: Data category of the event payload within the context of a | ||
`dataregulation` and `dataclassification`. For GDPR personal data typical | ||
labels are: `non-sensitive`, `standard`, `sensitive`, `special-category`. For | ||
US personal data this could be: `sensitive-pii`, `non-sensitive-pii`, | ||
`non-pii`. And for personal health information under HIPAA: `phi`. | ||
- Constraints: | ||
- OPTIONAL | ||
- if present, MUST be a non-empty string | ||
|
||
## Usage | ||
|
||
When this extension is used, producers MUST set the value of the | ||
`dataclassification` attribute. When applicable the `dataregulation` and | ||
`datacategory` attributes MAY be set to provide additional details on the | ||
classification context. | ||
|
||
When an implementation supports this extension, intermediaries and | ||
consumers MUST take these attributes into account and act in accordance with data | ||
regulations and/or internal policies when processing the event and payload. If | ||
intermediaries or consumers cannot meet such requirements, they MUST reject or | ||
ignore the event. | ||
|
||
Intermediaries SHOULD NOT modify the `dataclassification`, `dataregulation`, and | ||
`datacategory` attributes. | ||
|
||
## Use cases | ||
|
||
Examples where data classification of events can be useful are: | ||
|
||
- When an event contains PII or restricted information and therefore processing | ||
by intermediaries or consumers needs to adhere to certain policies. For example, | ||
having separate processing pipelines by sensitivity or having logging, | ||
auditing and access policies based upon classification. | ||
- When an event payload is subject to regulation and therefore retention | ||
policies apply. For example, having event retention policies based upon data | ||
classification or to enable automated data purging of durable topics. |
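The attribute constraints and usage rules in the diff above can be illustrated with a minimal Python sketch. This is not part of the PR or of any CloudEvents SDK: the function names `parse_regulations`, `read_classification`, and `should_process`, and the `cleared_labels` parameter, are hypothetical. Two points are interpretations rather than statements from the extension text: the "without internal spaces" rule is applied to each entry after trimming, and an empty `dataclassification` is treated like a missing one.

```python
from typing import Dict, List, Optional, Set

# Labels the extension recommends when no regulation-specific labels apply.
RECOMMENDED_LABELS = {"public", "internal", "confidential", "restricted"}


def parse_regulations(value: str) -> List[str]:
    """Split a comma-delimited `dataregulation` value into trimmed entries."""
    entries = [entry.strip() for entry in value.split(",")]
    for entry in entries:
        # Assumption: "without internal spaces" applies to each trimmed entry.
        if not entry or " " in entry:
            raise ValueError(f"invalid dataregulation entry: {entry!r}")
    return entries


def read_classification(attributes: Dict[str, str]) -> Dict[str, object]:
    """Validate the three extension attributes and return them in parsed form."""
    classification = attributes.get("dataclassification")
    # Assumption: an empty string is treated the same as a missing attribute.
    if not classification:
        raise ValueError("dataclassification is REQUIRED when the extension is used")

    regulations: List[str] = []
    if "dataregulation" in attributes:
        regulations = parse_regulations(attributes["dataregulation"])

    category: Optional[str] = attributes.get("datacategory")
    if category is not None and category == "":
        raise ValueError("datacategory MUST be a non-empty string if present")

    return {
        "dataclassification": classification,
        "dataregulation": regulations,
        "datacategory": category,
    }


def should_process(classification: Dict[str, object], cleared_labels: Set[str]) -> bool:
    """Hypothetical consumer policy: process only labels this consumer is cleared for."""
    return classification["dataclassification"] in cleared_labels
```

For an event carrying a `dataregulation` of `GDPR, HIPAA`, `parse_regulations` yields the entries `GDPR` and `HIPAA`. A consumer for which `should_process` returns `False` would then reject or ignore the event, as the Usage section requires.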
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Data Classification Extension | ||
This document has not been translated yet. Please use [the English version of the document](../../../extensions/data-classification.md) in the meantime. |
6 changes: 6 additions & 0 deletions
cloudevents/languages/zh-CN/extensions/data-classification.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Data Classification Extension | ||
|
||
This document has not been translated yet; please read the English [original document](../../../extensions/data-classification.md) first. | ||
|
||
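If you urgently need a Chinese translation of this document, please [submit an issue](https://github.com/cloudevents/spec/issues), | ||
and we will arrange for someone to translate it as soon as possible. |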
I realize this potentially goes down a rabbit-hole of trying to maintain catalogs, but is there value in formalizing some of the regulation codes or referencing some well-known external catalog (if one exists)?
In addition, does the applicability of some of these regulations vary by jurisdiction? If so, does that need to be represented in some fashion?
I have been searching for a catalog of data protection regulations, but I could not find a definitive source which also standardizes regulation codes. The best open source (outside some commercial legal websites) is that of UNCTAD - Data Protection and Privacy Legislation Worldwide. It provides an interactive overview and an Excel dataset of cyber-laws (incl. data protection and privacy) across the world. However, it does not define any (standardized) regulation codes.
I would not want to go as far as deriving and maintaining a catalog of data regulation codes. For often-referenced regulations (or standards) of countries/regions, de facto abbreviations are available, so in my view that could suffice. Would a small appendix with commonly used regulation codes, including a reference to the UNCTAD website, be useful?
Overall, I see the usage of this attribute as similar to other context attributes such as `source` and `subject`. The semantics of the values are based upon mutual understanding between producer and consumer, and I feel this extension should not be too prescriptive.
Unsure about the applicability question. I think this can vary per regulation/country. For example, GDPR applies to organizations within EU countries, but also to organizations outside the EU when they target EU citizens. I doubt whether this must be represented in some attribute and what direct value it would add. In my view, it is an agreement between producers and consumers of CloudEvents on how a data-classification label and applicable regulation are interpreted and how this influences processing. I would not want to define this in much more detail in this extension.
@JemDay any thoughts on Rob's reply? I'd like to see if we can close this one out by next week (our last call this year).
@JemDay would this be helpful or too much?
Appendix Data Protection and Privacy Regulations
A catalog of common data protection and privacy regulation and abbreviations
based upon UNCTAD (United Nations Conference on Trade and Development)
information. As UNCTAD itself does not define any abbreviations, this
is a non-exhaustive derivative list of most common regulations. For more
information see UNCTAD Data Protection and Privacy Legislation Worldwide.