Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement hard link support #95

Merged
merged 1 commit into from
Apr 30, 2016

Conversation

sruggier
Copy link
Contributor

This change adds support for hard links, and tests to go along with
that.

There is a bit of ugliness in this change around the use of the name
field on FakeFile instances. Support for hard links means that the same
file can have different names under different directories, so it
fundamentally doesn't make sense for FakeFile instances to have a name
associated with them.

However, I didn't want to have to perform a mass refactoring as a
prerequisite to making this change, so it instead just sets the filename
as needed to work with the existing implementation of AddObject. This
has the unavoidable side effect of changing the name associated with the
original file as well, since they share the same FakeFile instance.

It looks like this is probably safe to do now, but to properly ensure
that there aren't any problems related to this, the whole codebase
should be refactored to remove the name field from FakeFile, and replace
it with a suitable alternative. It may require a wrapper class that
helps callers keep track of a file's name while preventing misuse of
that information by direct users of FakeFile.

This change adds support for hard links, and tests to go along with
that.

There is a bit of ugliness in this change around the use of the name
field on FakeFile instances. Support for hard links means that the same
file can have different names under different directories, so it
fundamentally doesn't make sense for FakeFile instances to have a name
associated with them.

However, I didn't want to have to perform a mass refactoring as a
prerequisite to making this change, so it instead just sets the filename
as needed to work with the existing implementation of AddObject. This
has the unavoidable side effect of changing the name associated with the
original file as well, since they share the same FakeFile instance.

It looks like this is probably safe to do now, but to properly ensure
that there aren't any problems related to this, the whole codebase
should be refactored to remove the name field from FakeFile, and replace
it with a suitable alternative. It may require a wrapper class that
helps callers keep track of a file's name while preventing misuse of
that information by direct users of FakeFile.
@sruggier
Copy link
Contributor Author

I forgot to mention: this fixes #75.

"""
new_path_normalized = self.NormalizePath(new_path)
if self.Exists(new_path_normalized):
raise IOError(errno.EEXIST,
Copy link
Contributor

@jmcgeheeiv jmcgeheeiv Apr 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily a file. Also, let's give the full path so there is no mistake in the reader's mind,

'Fake file system object already exists at path', new_path_normalized

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, fair enough. I checked what ln reports, for inspiration, and they make the same mistake. I think "fake filesystem object" might be confusing for some peope. How about Fake file or directory already exists at path?

@jmcgeheeiv
Copy link
Contributor

jmcgeheeiv commented Apr 28, 2016

Let's go ahead and follow the lead of ln. You are right. "File or directory' is better than "Fake file system object".

I'm still trying to wrap my head around the ramifications of that old_file.name = new_basename thing.

@sruggier
Copy link
Contributor Author

From my attempt to review the codebase, I think the worst possible ramification right now is that some error messages could use the wrong filename, but that could change if someone tries to use that field for something more concrete than diagnostic output.


# abuse the name field to control the filename of the newly created link
old_file.name = new_basename
self.AddObject(new_parent_directory, old_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will currently also increase the file system size by the linked file size, as st_nlink is not taken into account in AddEntry(). I may fix this after the PR is merged back, as I added the file system size stuff only recently and could do some minor code cleanup there anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the FS size stuff wasn't around when I first implemented this. I noticed it when I rebased, there was a merge conflict in the unlink implementation, but I didn't do anything beyond just resolving that conflict. It's probably not critical to fix this issue prior to merging, though, given that both the file sizing and hard link support are new.

You can also play with these commits before they're merged, if you add a line like this into your git config:

[remote "origin"]
...
    fetch = +refs/pull/*/head:refs/remotes/origin/pr/*

That will cause git fetch to make all open PRs available under origin/pr/.

It would make sense for me to update this myself, but I've been pretty busy in the last few weeks, and I'm not sure when I'd find time to get it done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right that it's not critical, I can play with this this over the weekend if I find the time. I do not expect large changes to get it working, I will let you know when I'm done.

Copy link
Member

@mrbean-bremen mrbean-bremen Apr 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the following patch to your PR fixes the problem (and contains some minor refactoring regarding ChangeDiskUsage()):

 pyfakefs/fake_filesystem.py | 48 ++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/pyfakefs/fake_filesystem.py b/pyfakefs/fake_filesystem.py
index 8e6b731..b0e9f59 100644
--- a/pyfakefs/fake_filesystem.py
+++ b/pyfakefs/fake_filesystem.py
@@ -267,12 +267,8 @@ class FakeFile(object):
                     self.name)
     if self.st_size:
       self.SetSize(0)
-    if self.filesystem and self.filesystem.total_size is not None:
-      if self.filesystem.GetDiskUsage().free < st_size:
-        raise IOError(errno.ENOSPC,
-                      'Fake file system: disk is full, failed to set file size to %r bytes' % st_size,
-                      self.name)
-      self.filesystem.ChangeDiskUsage(st_size)
+    if self.filesystem:
+      self.filesystem.ChangeDiskUsage(st_size, self.name)
     self.st_size = st_size
     self.contents = None

@@ -305,12 +301,8 @@ class FakeFile(object):
     current_size = self.st_size or 0
     self.contents = contents
     self.st_size = st_size
-    if self.filesystem and self.filesystem.total_size is not None:
-      if self.filesystem.GetDiskUsage().free < self.st_size - current_size:
-        raise IOError(errno.ENOSPC,
-                      'Fake file system: disk is full, failed to set contents to %r bytes' % self.st_size,
-                      self.name)
-      self.filesystem.ChangeDiskUsage(self.st_size - current_size)
+    if self.filesystem:
+      self.filesystem.ChangeDiskUsage(self.st_size - current_size, self.name)
     self.epoch += 1

   def SetContents(self, contents):
@@ -339,6 +331,7 @@ class FakeFile(object):

     Raises:
       IOError: if the st_size arg is not a non-negative integer
+               or if st_size exceeds the available file system space
     """

     if not isinstance(st_size, int) or st_size < 0:
@@ -348,12 +341,8 @@ class FakeFile(object):
                     self.name)

     current_size = self.st_size or 0
-    if self.filesystem and self.filesystem.total_size is not None:
-      if self.filesystem.GetDiskUsage().free < st_size - current_size:
-        raise IOError(errno.ENOSPC,
-                      'Fake file system: disk is full, failed to set file size to %r bytes' % st_size,
-                      self.name)
-      self.filesystem.ChangeDiskUsage(st_size - current_size)
+    if self.filesystem:
+      self.filesystem.ChangeDiskUsage(st_size - current_size, self.name)
     if self.contents:
       if st_size < current_size:
         self.contents = self.contents[:st_size]
@@ -410,8 +399,8 @@ class FakeDirectory(FakeFile):
       path_object:  FakeFile instance to add as a child of this directory
     """
     self.contents[path_object.name] = path_object
-    if self.filesystem:
-      self.filesystem.ChangeDiskUsage(path_object.GetSize())
+    if self.filesystem and path_object.st_nlink == 1:
+      self.filesystem.ChangeDiskUsage(path_object.GetSize(), path_object.name)

   def GetEntry(self, pathname_name):
     """Retrieves the specified child file or directory.
@@ -434,10 +423,10 @@ class FakeDirectory(FakeFile):
     Raises:
       KeyError: if no child exists by the specified name
     """
-    if pathname_name in self.contents and self.filesystem:
-      self.filesystem.ChangeDiskUsage(-self.contents[pathname_name].GetSize())
-
     entry = self.contents[pathname_name]
+    if self.filesystem and entry.st_nlink == 1:
+      self.filesystem.ChangeDiskUsage(-entry.GetSize(), pathname_name)
+
     entry.st_nlink -= 1
     assert entry.st_nlink >= 0

@@ -498,14 +487,23 @@ class FakeFilesystem(object):
       return DiskUsage(self.total_size, self.used_size, self.total_size - self.used_size)
     return DiskUsage(1024*1024*1024*1024, 0, 1024*1024*1024*1024)

-  def ChangeDiskUsage(self, usage_change):
+  def ChangeDiskUsage(self, usage_change, file_path):
     """Changes the used disk space by the given amount.

     Args:
       usage_change: number of bytes added to the used space
                     if negative, the used space will be decreased
+
+      file_path: the path of the object needing the disk space
+
+    Raises:
+      IOError: if usage_change exceeds the free file system space
     """
-    if self.used_size is not None:
+    if self.total_size is not None:
+      if self.total_size - self.used_size < usage_change:
+        raise IOError(errno.ENOSPC,
+                      'Fake file system: disk is full, failed to add %r bytes' % usage_change,
+                      file_path)
       self.used_size += usage_change

   def SetIno(self, path, st_ino):

I also had to disable some tests under Windows:

 fake_filesystem_test.py | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fake_filesystem_test.py b/fake_filesystem_test.py
index 59063f9..b721ed5 100755
--- a/fake_filesystem_test.py
+++ b/fake_filesystem_test.py
@@ -1747,6 +1747,7 @@ class FakeOsModuleTest(TestCase):
     self.assertRaises(OSError,
                      self.os.link, '/nonexistent_source', '/link_dest')

+  @unittest.skipIf(TestCase.is_windows, 'no hard link support in Windows')
   def testLinkDelete(self):
     fake_open = fake_filesystem.FakeFileOpen(self.filesystem)

@@ -1797,6 +1798,7 @@ class FakeOsModuleTest(TestCase):
     self.assertRaises(OSError,
                      self.os.link, file1_path, breaking_link_path)

+  @unittest.skipIf(TestCase.is_windows, 'no hard link support in Windows')
   def testLinkCount1(self):
     """Test that hard link counts are updated correctly."""
     file1_path = 'test_file1'
@@ -3485,6 +3487,21 @@ class DiskSpaceTest(TestCase):
     self.os.rename('/foo/bar', '/foo/baz')
     self.assertEqual(20, self.filesystem.GetDiskUsage().used)

+  @unittest.skipIf(TestCase.is_windows, 'no hard link support in Windows')
+  def testThatHardLinkDoesNotChangeUsedSize(self):
+    file1_path = 'test_file1'
+    file2_path = 'test_file2'
+    self.filesystem.CreateFile(file1_path, st_size=20)
+    self.assertEqual(20, self.filesystem.GetDiskUsage().used)
+    # creating a hard link shall not increase used space
+    self.os.link(file1_path, file2_path)
+    self.assertEqual(20, self.filesystem.GetDiskUsage().used)
+    # removing a file shall not decrease used space if a hard link still exists
+    self.os.unlink(file1_path)
+    self.assertEqual(20, self.filesystem.GetDiskUsage().used)
+    self.os.unlink(file2_path)
+    self.assertEqual(0, self.filesystem.GetDiskUsage().used)
+

 if __name__ == '__main__':
   unittest.main()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a second look at os.link and noticed that since Python 3.2 it also supports hard links under Windows - I wasn't aware of this, as I'm still mostly using Python 2.7. I may have a look at this later, but I think it is easier to first merge back your changes.

@jmcgeheeiv jmcgeheeiv merged commit b9c7644 into pytest-dev:master Apr 30, 2016
@jmcgeheeiv
Copy link
Contributor

@mrbean-bremen, could I ask you to issue a PR with the changes in your patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants