-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Add optional dependencies #241
Conversation
Change intended optional dependencies to really be optional Chore COG-595
Warning Rate limit exceeded@dexters1 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 2 minutes and 27 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe pull request introduces changes to several workflow files for testing Neo4j, Qdrant, and Weaviate, updating dependency installation commands to include specific extras for each database. Additionally, the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Resolve issue with forced import of optional pgvector package. Fix COG-595
Add install of optional databases packages for database gh actions Fix COG-595
Switched falkordb and posthog to be optional dependencies Chore COG-595
Update README.md with database optional dependencies Chore COG-595
Added langchain as optional dependency Chore COG-595
Update readme to be a bit more descriptive about installation Chore COG-595
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
cognee/infrastructure/databases/vector/pgvector/PGVectorAdapter.py (1)
Line range hint
70-85
: Avoid duplicating thePGVectorDataPoint
class definitionThe
PGVectorDataPoint
class is defined within both thecreate_collection
andcreate_data_points
methods. This leads to code duplication and may cause maintenance challenges. Consider defining the class once, possibly at the class level or in a separate module, and reuse it across methods.Apply this diff to refactor the class definition:
+from pgvector.sqlalchemy import Vector +class PGVectorDataPoint(Base): + __tablename__ = None # Will be set dynamically + __table_args__ = {"extend_existing": True} + # PGVector requires one column to be the primary key + primary_key: Mapped[int] = mapped_column( + primary_key=True, autoincrement=True + ) + id: Mapped[Any] + payload = Column(JSON) + vector = Column(Vector(None)) + def __init__(self, id, payload, vector): + self.id = id + self.payload = payload + self.vector = vectorThen, within your methods, set the appropriate attributes before using the class:
def create_collection(self, collection_name: str, payload_schema=None): PGVectorDataPoint.__tablename__ = collection_name vector_size = self.embedding_engine.get_vector_size() PGVectorDataPoint.vector.type.dim = vector_size # proceed with the rest of the method def create_data_points(self, collection_name: str, data_points: List[DataPoint]): PGVectorDataPoint.__tablename__ = collection_name vector_size = self.embedding_engine.get_vector_size() PGVectorDataPoint.vector.type.dim = vector_size # proceed with the rest of the methodpyproject.toml (1)
92-92
: Clarify the inclusion ofnotebook
in both dev dependencies and extrasThe
notebook
package is marked as optional in the[tool.poetry.group.dev.dependencies]
section and is also included in the[tool.poetry.extras]
section. This might lead to confusion about its intended usage. Consider:
- If
notebook
is only for development purposes, keep it indev.dependencies
.- If it's meant to be an optional feature for users, include it in
extras
.- If both, ensure documentation clarifies its dual role.
README.md (1)
Line range hint
386-392
: Enhance database compatibility informationConsider adding the following improvements to the implementation state table:
- Include minimum supported versions for each database
- Add known limitations or requirements for stable implementations
- Remove the HTML
<style>
tag as it's not standard markdown and may not work on all platformsHere's a suggested format:
- <style> - table { - width: 100%; - } - </style> - - | Name | Type | Current state | Known Issues | + | Name | Type | Current state | Supported Version | Known Issues |
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (7)
.github/workflows/test_neo4j.yml
(1 hunks).github/workflows/test_qdrant.yml
(1 hunks).github/workflows/test_weaviate.yml
(1 hunks)README.md
(3 hunks)cognee/infrastructure/databases/vector/pgvector/PGVectorAdapter.py
(2 hunks)cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py
(0 hunks)pyproject.toml
(3 hunks)
💤 Files with no reviewable changes (1)
- cognee/infrastructure/databases/vector/pgvector/create_db_and_tables.py
🔇 Additional comments (6)
cognee/infrastructure/databases/vector/pgvector/PGVectorAdapter.py (1)
Line range hint 109-125
: Avoid duplicating the PGVectorDataPoint
class definition
This is a duplicate of the earlier comment regarding the PGVectorDataPoint
class being defined within methods. Refactoring as suggested will improve code maintainability.
.github/workflows/test_qdrant.yml (1)
50-50
: Include qdrant
extra in dependency installation
Adding the -E qdrant
option ensures that the optional Qdrant dependencies are installed, which is necessary for the integration tests to run properly.
.github/workflows/test_weaviate.yml (1)
50-50
: Include weaviate
extra in dependency installation
Including the -E weaviate
option in the installation command ensures that the Weaviate-specific dependencies are installed, enabling the integration tests to execute correctly.
.github/workflows/test_neo4j.yml (1)
49-49
: Include neo4j
extra in dependency installation
Adding -E neo4j
to the poetry install command ensures that the Neo4j optional dependencies are included, which is necessary for the Neo4j integration tests.
pyproject.toml (1)
33-33
: Optional dependencies correctly marked and extras defined
The dependencies like falkordb
, qdrant-client
, weaviate-client
, neo4j
, langchain_text_splitters
, langsmith
, posthog
, asyncpg
, and pgvector
have been marked as optional, and corresponding extras have been defined. This allows users to install only the necessary dependencies based on their needs.
Also applies to: 46-46, 49-49, 52-52, 56-57, 59-59, 69-70, 80-81, 83-84
README.md (1)
Line range hint 23-85
: Consider reordering database installation sections based on stability status
The installation sections show PostgreSQL first, but according to the implementation state table, it's marked as unstable. Consider reordering the sections to show stable implementations (Qdrant, Weaviate, Neo4j) first, followed by unstable ones (PostgreSQL) to better guide users toward recommended options.
Let's verify the stability status of each database integration:
Added groq and langfuse as optional dependencies Chore COG-595
…s/cognee into COG-595-optional-dependencies
Change intended optional dependencies to really be optional
List of optional dependencies added in PR:
qdrant,
neo4j,
weaviate,
falkordb,
langchain,
jupyter notebook,
posthog,
groq,
langfuse
Chore COG-595
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Chores
pyproject.toml
to allow for optional dependencies, enhancing flexibility.