-
Notifications
You must be signed in to change notification settings - Fork 343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shared schema link #1055
Shared schema link #1055
Conversation
89253bf
to
37d1759
Compare
…sing shared schema" This reverts commit 5a5c0d6.
37d1759
to
838ae76
Compare
838ae76
to
01715a1
Compare
let ns = self | ||
.inner | ||
.make_namespace | ||
.create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we create shared_schema_db here. Shouldn't it already exist here @Horusiath ?
If we're creating it for every link what will happen if we create 2 databases that link to the same shared schema? Wouldn't it be created twice then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't it enough to just call let e = self.inner.store.get(shared_schema_db).await;
?
@LucioFranco please review the metastore changes |
} | ||
if config.config.is_shared_schema { | ||
let res: rusqlite::Result<u32> = guard.conn.query_row( | ||
"SELECT COUNT(1) FROM shared_schema_links WHERE shared_schema_name = ?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of using COUNT(1) instead of COUNT(*) here?
I think the data model for schema links is wrong. It duplicates entries:
Something's not consistent here. We're using INSERT OR REPLACE but we never replace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right approach. I think the linked schema should be part of the config, and then have a table/column to quickly lookup what table is using what schema, if any. This can be enforced by sqlite constraints as well.
#[derive(Debug, Default)] | ||
struct MetaState { | ||
config: Arc<DatabaseConfig>, | ||
shared_schemas: HashMap<NamespaceName, HashSet<NamespaceName>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we store that? and why here? this is some database-specific config; why does it have a map of the namespace to sets of namespaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a map from shared_schema -> set of dbs that use that shared schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not the place to store that. One instance of that is created for every namespace. Why do we need that mapping in memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be empty for everything but shared schema databases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still don't understand why this needs to me in memory. Also, why is it is mapping namespaces to maps of namespaces? Are we storing all the shared schema databases to the databases that uses them? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but:
- the shared schema is almost never accessed, it's not like you are doing 10 migrations per second
- it can get very big, if you have 100s of thousands of dbs for a single namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing: something that we should do asap unload the config when the namespace is unloaded. Imagine the cost of reloading all the shared namespaces in memory every time that schema is accessed for one reason or another. This is potentially very long, and we most likely don't care about what db depend on that schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are fair points. So instead of caching in memory we fetch the list from the meta store each time it's being used. That's what you're proposing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, might even be an iterator, it we don't need the whole list at once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarinPostma @haaawk for the context: this piece of code is made for compliance with HandleState::Internal
which - from what I've seen - is used only for testing purposes.
shared_schema_name TEXT NOT NULL, | ||
db_name TEXT NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we create a references to the main table keys instead?
if let Some(shared_schema) = shared_schema_name { | ||
app_state | ||
.namespaces | ||
.shared_schema_link(shared_schema, namespace) | ||
.await?; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not good, we could end up in a state with the db created and the not linked. This should be done in the same transaction as storing the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be fair this is not worse than what we have already where we could end up with db being created but it's config missing, no @MarinPostma ?
WDYM @MarinPostma isn't @Horusiath creating such a table?
You meant what namespace is using what schema, right? |
I've applied all the comments here -> #1080 |
This PR depends on ##1049
It basically stores the relationship between shared schema db and databases linked to it:
shared_schema_links
table in the meta store of a shared-schema db.TODO: namespace forking (for linked db) also should store the link to shared schema, but this PR doesn't solve this yet.