Commit 2dcdfc3 (1 parent: 6c1ca95)
Showing 1 changed file with 36 additions and 0 deletions.
@@ -0,0 +1,36 @@
NOTES FOR FUTURE REFERENCE

The current model for searching by keyword is to use yake to extract all keywords from the question.
The code then stores the question's keywords in a database so that we don't need to
re-extract keywords from files we have already parsed, since extracting keywords is a costly operation.
Although extraction is costly, it is fast enough for the low amount of traffic ACMAS gets.
One possible edge case is when two questions with different contexts have the same keywords.
One solution would be to extract and match the keywords in the order they appear in the original
question.
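
The caching step described above could be sketched roughly as follows. This is only an illustration, not the actual ACMAS code: the table name `question_keywords`, the function `get_keywords`, and the `extract` callable are all hypothetical. The extractor is passed in so the sketch runs without yake installed; in practice it would presumably wrap yake's keyword extractor.

```python
import sqlite3
from typing import Callable, List


def get_keywords(
    conn: sqlite3.Connection,
    question: str,
    extract: Callable[[str], List[str]],
) -> List[str]:
    """Return the keywords for `question`, extracting only on a cache miss.

    `extract` is a hypothetical stand-in for the costly extraction step
    (for example, a thin wrapper around yake).
    """
    # Hypothetical schema: one row per keyword, keeping the original order.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS question_keywords ("
        "question TEXT NOT NULL, "
        "position INTEGER NOT NULL, "
        "keyword TEXT NOT NULL, "
        "PRIMARY KEY (question, position))"
    )
    rows = conn.execute(
        "SELECT keyword FROM question_keywords "
        "WHERE question = ? ORDER BY position",
        (question,),
    ).fetchall()
    if rows:
        # Cache hit: skip the costly extraction entirely.
        return [row[0] for row in rows]

    # Cache miss: extract once and store the result for next time.
    # Note: a question with no keywords is re-extracted on every call.
    keywords = extract(question)
    conn.executemany(
        "INSERT INTO question_keywords (question, position, keyword) "
        "VALUES (?, ?, ?)",
        [(question, i, kw) for i, kw in enumerate(keywords)],
    )
    conn.commit()
    return keywords
```

Storing the position of each keyword alongside the keyword itself keeps the original order available, which the in-order matching idea relies on.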

One way to avoid overloading the server with extraction is to have the clients extract the
keywords from their questions instead. In this case, the clients would run yake in their browser and
send the results to our servers. This would offload the extraction work to the clients.

A solution to the edge case would be the keyword ordering solution, which matches keywords in the order they appear.

Q1 Keywords: integrate lambda function C
Q2 Keywords: integrate C lambda function

These two questions have the same keywords, so both would show up for the same query even though they are
unrelated questions.

If our search query was "How do I integrate lambda in C", the keywords of this query would be
Query keywords: integrate lambda C

If we were to match the keywords in order against each question, we would get

Q1: integrate lambda C
Q2: integrate lambda

For Q2, C gets skipped because it comes before lambda in Q2, while in our search query lambda is the second
keyword and C comes after it. Q1 therefore matches all three query keywords in order, while Q2 matches only two.
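
The in-order matching walked through above can be sketched as a small greedy routine. This is one possible reading of the idea, not the existing implementation; the function name `match_in_order` is made up for illustration. Each query keyword is searched for only after the position of the previous match, so a keyword that appears too early in the question is skipped.

```python
from typing import List


def match_in_order(query_keywords: List[str], question_keywords: List[str]) -> List[str]:
    """Greedily match query keywords against a question's keywords, in order."""
    matched = []
    start = 0  # only search the question's keywords after the previous match
    for keyword in query_keywords:
        try:
            index = question_keywords.index(keyword, start)
        except ValueError:
            continue  # keyword does not appear after the last match
        matched.append(keyword)
        start = index + 1
    return matched


query = ["integrate", "lambda", "C"]
q1 = ["integrate", "lambda", "function", "C"]
q2 = ["integrate", "C", "lambda", "function"]

print(match_in_order(query, q1))  # ['integrate', 'lambda', 'C']
print(match_in_order(query, q2))  # ['integrate', 'lambda']
```

Results could then be ranked by the number of keywords matched, which would place Q1 ahead of Q2 for this query.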