Motivation
Currently, Apache Druid doesn’t support restricting access to lookup tables. This is problematic when running Druid in a multitenant environment: it can expose sensitive information and violate companies' internal policies. Our motivation is to enable Druid users to set permissions on lookup tables.
Current findings & proposals
The main entry point for query handling is the QueryLifecycle class. Authorization is handled in the method public Access authorize(HttpServletRequest req) where permissions are modeled as a set of ResourceAction objects.
The method authorize() generates the resource actions for all tables that a query refers to in the following lines:
Finally, the DataSource class specifies what qualifies as a table name. A comment there states clearly that lookups are not included in the list of table names a query generates: “Returns the names of all table datasources involved in this query. Does not include names for non-tables, like lookups or inline datasources.”
However, a LookupDataSource is listed in the @JsonSubTypes declarations. When we checked the LookupDataSource class, which would be instantiated for queries like SELECT * FROM lookup.mylookup, we found that it returns an empty set of table names:
public Set<String> getTableNames() {
    return Collections.emptySet();
}
So currently, neither inline use of LOOKUP() calls nor querying the lookup tables directly can be secured in Druid.
Would modifying the LookupDataSource class to return the injected table name via getTableNames() be sufficient to enforce restrictions on lookup tables and treat them as queryable data sources?
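To make the question concrete, here is a minimal, self-contained sketch of the idea. It is not Druid code: the DataSource interface, both lookup classes, and requiredReadActions() are hypothetical stand-ins, and a "READ:name" string stands in for a ResourceAction. It only illustrates that when required actions are derived from getTableNames(), an empty set means nothing gets checked, while returning the lookup name would produce an action to authorize.

```java
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

public class LookupAuthSketch {
    // Hypothetical stand-in for Druid's DataSource; only the method discussed here.
    interface DataSource {
        Set<String> getTableNames();
    }

    // Mirrors the behavior described above: the lookup reports no table names.
    static final class CurrentLookupDataSource implements DataSource {
        private final String lookupName; // known to the class, but never reported

        CurrentLookupDataSource(String lookupName) {
            this.lookupName = lookupName;
        }

        @Override
        public Set<String> getTableNames() {
            return Collections.emptySet();
        }
    }

    // The proposed change (an assumption, not existing behavior): report the lookup name.
    static final class ProposedLookupDataSource implements DataSource {
        private final String lookupName;

        ProposedLookupDataSource(String lookupName) {
            this.lookupName = lookupName;
        }

        @Override
        public Set<String> getTableNames() {
            return Collections.singleton(lookupName);
        }
    }

    // Simplified stand-in for building ResourceAction objects from table names:
    // one "READ:<name>" entry per table name the data source reports.
    static Set<String> requiredReadActions(DataSource dataSource) {
        return dataSource.getTableNames().stream()
                .map(name -> "READ:" + name)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // No actions are required today, so there is nothing for an authorizer to deny.
        System.out.println(requiredReadActions(new CurrentLookupDataSource("mylookup")));
        // With the proposed change, one read action on the lookup would be required.
        System.out.println(requiredReadActions(new ProposedLookupDataSource("mylookup")));
    }
}
```

One open point with this approach: if lookup names were reported as plain table names, a lookup and a regular datasource with the same name would share a permission, so a separate resource type for lookups might be worth considering.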
To be discussed
Is there a compelling reason to keep excluding lookups from the access checks? Since all lookups are arranged into the same schema (lookup.*), wouldn’t it be easy to cover all of them with a single rule that permits access?
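As a rough illustration of the single-rule idea, the sketch below models a permission as a resource type plus a name pattern. Everything in it is hypothetical: the Rule class, the isAllowed() helper, and the "LOOKUP" resource type are made up for this example and are not part of Druid's security API. It only shows that one wildcard rule could permit every lookup for users who should keep today's unrestricted behavior.

```java
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

public class LookupRuleSketch {
    // Hypothetical permission rule: a resource type plus a regex over resource names.
    static final class Rule {
        private final String resourceType;
        private final Pattern namePattern;

        Rule(String resourceType, String nameRegex) {
            this.resourceType = resourceType;
            this.namePattern = Pattern.compile(nameRegex);
        }

        // A rule permits a resource when the type matches and the whole name matches the regex.
        boolean permits(String type, String name) {
            return resourceType.equals(type) && namePattern.matcher(name).matches();
        }
    }

    // Access is allowed if any rule permits the requested resource.
    static boolean isAllowed(List<Rule> rules, String type, String name) {
        return rules.stream().anyMatch(rule -> rule.permits(type, name));
    }

    public static void main(String[] args) {
        // One wildcard rule covering every lookup ("LOOKUP" is an assumed resource type).
        List<Rule> rules = Collections.singletonList(new Rule("LOOKUP", ".*"));

        System.out.println(isAllowed(rules, "LOOKUP", "mylookup"));
        System.out.println(isAllowed(rules, "DATASOURCE", "mylookup"));
    }
}
```

A tenant-specific setup could replace the wildcard with a narrower pattern per role, which is the restriction this issue is asking for.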