Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Epic] Auth Schema #32

Closed
7 of 15 tasks
nelsonic opened this issue Feb 28, 2019 · 1 comment
Closed
7 of 15 tasks

[Epic] Auth Schema #32

nelsonic opened this issue Feb 28, 2019 · 1 comment
Assignees
Labels
awaiting-review An issue or pull request that needs to be reviewed enhancement New feature or enhancement of existing functionality epic A feature idea that is large enough to require a sprint (5 days) or more and has smaller sub-issues. in-progress An issue or pull request that is being worked on by the assigned person T1d Time Estimate 1 Day technical A technical issue that requires understanding of the code, infrastructure or dependencies

Comments

@nelsonic
Copy link
Member

nelsonic commented Feb 28, 2019

On Monday we drew out the basic auth schema on the witeboard in the Office together:

auth-schema-whiteboard-diagram

This issue/epic is our attempt to capture as much detail as we can so we can implement it.
The schema defined herein is not meant to be "set in stone", rather it's a starting point which will allow us to ship and then iterate as new requirements surface. :shipit:

Schema/Table/Record Naming?

While the photo of the schema above uses the word "users", this was only for the purposes of "sketching" it out on the whiteboard and is not the table/schema name we are going to use.
For more detail on why we are choosing to name the schema people, see: dwyl/app#33

people 👤

At the most basic level, we need a table to store encrypted email addresses and hashed passwords to allow a person to register and login to the app/service where auth is being used.

Schema

  • inserted_at - (standard ecto/phoenix schema type: :naive_datetime) when a record is inserted. Automatically set by the database which ensures record integrity.

  • cid (Primary Key. Ecto type :string) - the hash of the data being inserted (the Ecto changeset minus the inserted_at timestamp) such that we know that a record is unique and verifiable.
    This will use our excid implementation: https://github.com/dwyl/cid

  • person_id (Ecto type :string) - this is a unique id for the person which allows queries to be performed across the table. For example, the following query will return the latest version of person record: SELECT * FROM people WHERE person_id = 'e096' ORDER BY inserted_at DESC LIMIT 1;

  • prev (Ecto type :string) - the reference to the previous cid for the record.
    This is "metadata" included by alog which the person using the app will never see but is useful in an "audit trail" if we ever need to determine if a "rogue" DBA has altered records. If you haven't had exposure to data fraud/forensics see: wikipedia.org/wiki/Forensic_data_analysis

  • email_encrypted (Ecto type :binary) - encrypted email address which can be decrypted if/when we need to send someone an email from our App. This will use Fields.EmailEncrypted to transparently apply strong encryption to the data before storing it.

  • email_hash (Ecto type :binary. Uses Fields.EmailHash. Unique Index) - a (per application) salted sha256 hash of the person's email address which allows for fast lookup when the person is attempting to authenticate. This will use Fields.EmailHash to transparently hash the email address. For more detail on why this is needed and how it works,
    see: https://github.com/dwyl/phoenix-ecto-encryption-example

  • verified (Ecto type :naive_datetime) - the timestamp when the email address was verified. If this field is null (not yet set) it means that the email address has not been verified. This will appear on the person's Profile as email unverified and limit their actions in the App.
    For example: on CS it will mean their reviews of a drinks/venues will only be visible to admins/editors until the email address is verified. This is to avoid spam content.

  • password_hash - (Ecto type :binary, uses Fields.Password) - the Argon2 hash of the person's password, safely stored to avoid any risk of reversal in the unlikely event of a "database breach".

  • first_name - (Ecto type :binary, uses Fields.Encrypted) the person's name. How we address them in emails and in UI customisations.

Notes: since we are never overwriting data in our append-only log, updated_at is not needed.
Also, you may notice that other schemas in our App have a created_at field to capture when a record was created (the timestamp when a task/timer that was created while the person using the app was offline...) this is not relevant to Auth because all changes to the Auth tables must be performed while the person is online.

people Table Example (Encrypted/Hashed)1

In the following example, the person record "progresses"2 through various stages, these correspond to the row number:

  1. anonymous - the person is using the app or website anonymously. That's "ok". They can still perform certain actions, they just won't be able to "save" them or receive updates. The UI should prompt them to "register to save your preferences".
  2. initial registration - the person inputs their email address to "register" to use the App. The App simply captures their email address and sends them an email requesting them to click a link to verify that they own that email address.

When the person first registers they will not have had an previous contact with the App,
therefore there is no prev (previous cid) to be linked back to.

  1. verification - the person clicks the link sent to them by email and verifies their email address.

Notice how the prev (previous hash) in the verification step refers to the cid in the initial registration step e096d1004 this referential integrity means that we know the data is valid (and has not been tampered with!)

  1. password creation - the person defines a password so they can re-authenticate.
  2. naming - the person tells the App their name so the interface can be personalised.3
  3. aliasing their person_id - the person wants to have a specific "alias" in the App, so they update their person_id to their desired handle.
row inserted_at cid (PK) person_id prev email_encrypted email_hash verified password_hash first_name
1 1541609554 9cC4os9MH 9c null null null null null null
2 1541609554 e096d1004 9c 9cC4os9MH 1BA6546A0C5e7 bmt3d2KpD null null null
3 1541609876 ab4362a37 9c e096d1004 1BA6546A0C5e7 bmt3d2KpD 1541609876 null null
4 1541610203 MqEzBmto9 9c ab4362a37 1BA6546A0C5e7 bmt3d2KpD 1541609876 wKIGu6djDt null
5 1541611381 tjGbE5BUd 9c MqEzBmto9 1BA6546A0C5e7 bmt3d2KpD 1541609876 wKIGu6djDt NzKibHfx
6 1541612984 ktBcXWZtp alex tjGbE5BUd 1BA6546A0C5e7 bmt3d2KpD 1541609876 wKIGu6djDt NzKibHfx

1 values for cid, person_id, email_encrypted, email_hash, password_hash and prev have all been truncated for brevity in this table (to reduce scrolling). In the real app, we would store the full hash, encrypted blob or CID for these fields/columns. The only exception to this is person_id which is a temporary value that the person can change e.g from 9c to alex.

2 If you feel the "progression" of the person record is a "wasteful" way of capturing the data, remember that we are not concerned with how much disk space a record (or the evolution of a record) takes up, data is cheap! The value of the insight we derive from analytics far outweighs the cost of storing the extra data. Ask Amazon, Facebook or Google if they worry about data storage costs when capturing incremental analytics data ... they don't! They capture everything!
If anyone else using the Auth module/app in their own project feels they need to "clear out" older records, they can easily run a batch process to free up the disk space.

3If the UX designer prefers to capture both the person's email address, first name and password in a single registration form, they can! The Auth.people table will accept any combination of data to be inserted at any time. This means that registration can be easily A/B tested! 🆎

## Anonymous

More Schemas/Tables?

As you noticed, the tables from the Whiteboard photo (above) are not all included in this issue.
This is because we don't want this issue to be enormous so instead we are splitting out the remaining tables into linked sub-issues. And, in the case of Analytics a sub-project: https://github.com/dwyl/atm

Todo

  • Finish defining how alog stores and retrieves records.
  • Use alog and fields to create the people schema.
  • Ship! :shipit:
@nelsonic nelsonic added enhancement New feature or enhancement of existing functionality epic A feature idea that is large enough to require a sprint (5 days) or more and has smaller sub-issues. awaiting-review An issue or pull request that needs to be reviewed technical A technical issue that requires understanding of the code, infrastructure or dependencies labels Feb 28, 2019
@nelsonic nelsonic mentioned this issue Mar 1, 2019
13 tasks
@nelsonic nelsonic added T1d Time Estimate 1 Day in-progress An issue or pull request that is being worked on by the assigned person labels Feb 2, 2020
@nelsonic
Copy link
Member Author

Auth ERD with Roles and Permissions:

auth-erd-with-roles-permissions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review An issue or pull request that needs to be reviewed enhancement New feature or enhancement of existing functionality epic A feature idea that is large enough to require a sprint (5 days) or more and has smaller sub-issues. in-progress An issue or pull request that is being worked on by the assigned person T1d Time Estimate 1 Day technical A technical issue that requires understanding of the code, infrastructure or dependencies
Projects
None yet
Development

No branches or pull requests

6 participants