-
Notifications
You must be signed in to change notification settings - Fork 74
DLFS - A FileSystem API wrapper over dlog API #227
Conversation
Descriptions of the changes in this PR: reuse the methods used by `rename` to create missing path components. (the test is covered by #227) Author: Sijie Guo <[email protected]> Reviewers: Jia Zhai <None> This closes #228 from sijie/fix_create_log_pr
Descriptions of the changes in this PR: exclude `<default>` from listing logs (the tests are covered by #227) Author: Sijie Guo <[email protected]> Reviewers: Jia Zhai <None> This closes #229 from sijie/fix_listing_log_pr, closes #224
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very cool. A few minor comments.
confLocal.addConfiguration(dlConf); | ||
confLocal.setEnsembleSize(replication); | ||
confLocal.setWriteQuorumSize(replication); | ||
confLocal.setAckQuorumSize(replication); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack quorum == write quorum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hdfs api has only one replication factor. so I am making ack quorum == write quorum to match similar behavior as hdfs.
Progressable progressable) throws IOException { | ||
// for overwrite, delete the existing file first. | ||
if (overwrite) { | ||
delete(path, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens in the case that create fails but you managed to delete the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current approach doesn't guarantee any atomicity. this can be improved if dlog api supports opening a log stream in overwrite mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create an issue for this - #231
} | ||
|
||
@Override | ||
public boolean truncate(Path f, long newLength) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this even be supported without reading the whole log and rewriting? Is this used much in any case? Seems to me like a map reduce specific thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
truncate(f, newLength) is a posix filesystem semantic. I see it is used in multiple places -- when people want to delete potential-corrupted data from file (e.g. due to application crashes), they use truncate.
in dlog, we support truncate the head. in filesystem, truncate is truncating the tail. it is actually easy to support - whenever a truncate is issued, it should close active segment (ledger), based on the length to find what segments can be fully deleted and what segments can be partially deleted. for segments that can be fully deleted, just delete the ledgers and for segments that can be partially deleted, update the end position (dlsn) in the segments' metadata.
all the metadata status and mechanism are sorted of ready in dlog, since it supports truncate the head. adding truncate to tail can use those statuses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create an issue for this - #232
@Override | ||
public void flush() throws IOException { | ||
try { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line looks like you wanted to say more here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, those empty lines were left because I abstract the logic of writeControlRecord method. I will remove the empty lines.
@jiazhai can you review this again? |
} | ||
} | ||
|
||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems only truncate is not supported now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch. I will update the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
|
||
writerOutputBufferSize=131072 | ||
|
||
numWorkerThreads=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to set this as "1" instead of the default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh this configuration is used only for testing. so I set it to 1 thread.
addressed the comments. |
Descriptions of the changes in this PR:
(This is based on initial implementation from @gerritsundaram at #43)
Features supported:
Features aren't supported:
(This change includes small changes for #224 #225 #226 ).