Reported by Roger Demetrescu:
I've find an issue with ftputil 3.2 (well, it happens with 2.8 and 3.0 too) when trying to use host.makedirs() withing a FTP server that doesnt show any directories/files under de root ('/')
My customer gave me this directory to work:
/aaa/bbb/ccc
When I try to create this directory: ./ddd/eee/
FTPHost.makedirs() try
do create those directories:
/aaa
/aaa/bbb
/aaa/bbb/ccc
/aaa/bbb/ccc/ddd
/aaa/bbb/ccc/ddd/eee
Of course, /aaa already exists, and the ftputil does this check to see if it should reraise the exception:
if not self.path.isdir(next_directory):
The problem is: I am able to do a host.chdir('/aaa/bbb/ccc')
but if I
do a host.chdir('/')
and try to list the files and directories, it
returns an empty list.
Using FileZilla to connect to this FTP server gave me the same result. I
am able to see all files and directories under /aaa/bbb/ccc
but I see
nothing under the root: /
FTP is a Microsoft FTP Service (don't know any more details).
So, any suggestion on how to make ftputil stop trying to create these directories:
/aaa
/aaa/bbb
/aaa/bbb/ccc
and only try to create:
/aaa/bbb/ccc/ddd
/aaa/bbb/ccc/ddd/eee
since my current directory is already /aaa/bbb/ccc
?
Ok... I did some more tests and found out that I am not able to see any files/directories inside:
/aaa /aaa/bbb
I had to change FTPHost's makedirs implementation to this code:
# Ignore unused argument `mode` # pylint: disable=unused-argument def makedirs(self, path, mode=None): """ Make the directory `path`, but also make not yet existing intermediate directories, like `os.makedirs`. The value of `mode` is only accepted for compatibility with `os.makedirs` but otherwise ignored. """ path = ftputil.tool.as_unicode(path) path = self.path.abspath(path) directories = path.split(self.sep) old_dir = self.getcwd() try: # Try to build the directory chain from the "uppermost" to # the "lowermost" directory. for index in range(1, len(directories)): # Re-insert the separator which got lost by using `path.split`. next_directory = self.sep + self.path.join(*directories[:index+1]) try: self.chdir(next_directory) except ftputil.error.PermanentError: self.mkdir(next_directory) finally: self.chdir(old_dir)
It works like a charm... :)
I'm glad you could make it work - and saved me some brooding over the problem. ;-)
When seeing your solution I was thinking about whether it was a complete substitute (and hence could go into ftputil) or if it was failing for some other cases the current
makedirs
implementation covers. Particularly, could there be reasons forchdir
to fail, apart from the non-existence of the directory? What about other problems?I can think of these issues:
- The directory
next_directory
exists, but you can't change to it because you don't have execute permission (or some equivalent on the server).In this case
mkdir
would try to create it and fail and thus raise an exception (as in the old implementation). If the directory existed and was the last in the chain of directories to create,makedirs
should have succeeded though. So this is a case that the old implementation covers but not the new one.If the directory is not the last in the chain,
chdir
would fail andmkdir
as well and raise aPermanentError
. The same would have happened with the old implementation and is in my opinion the expected behavior.
- The "directory"
next_directory
exists, but is a file.Here,
chdir
would fail and then themkdir
as well, which would be the expected behavior.
- You aren't allowed to change to
/aaa
though you might be allowed to change to/aaa/bbb
.On Unix, you can't change to
/aaa/bbb
if you can't change to/aaa
if I'm not mistaken. I don't know if the same is true for FTP servers. It might be but due to some anomaly it might not be. If this anomaly applies, the old implementation might work but not the new one.
- What cases might I have forgotten? ...
Can you think of a way to counter the above problems with the new approach but still keeping the advantages of the old and new one? Ok, the last bullet item seems obscure to me, but the problem from the first one not so much. But even if we were able to solve the problem from the first item, it would mean we make
makedirs
work for one obscure case, but fail for another obscure case. ;-/
Roger suggested this implementation which is a mix of the original implementation and his.
# Ignore unused argument `mode` # pylint: disable=unused-argument def makedirs(self, path, mode=None): """ Make the directory `path`, but also make not yet existing intermediate directories, like `os.makedirs`. The value of `mode` is only accepted for compatibility with `os.makedirs` but otherwise ignored. """ path = ftputil.tool.as_unicode(path) path = self.path.abspath(path) directories = path.split(self.sep) old_dir = self.getcwd() try: # Try to build the directory chain from the "uppermost" to # the "lowermost" directory. for index in range(1, len(directories)): # Re-insert the separator which got lost by using `path.split`. next_directory = self.sep + self.path.join(*directories[:index+1]) try: self.chdir(next_directory) except ftputil.error.PermanentError: try: self.mkdir(next_directory) except ftputil.error.PermanentError: # Find out the cause of the error. Re-raise the # exception only if the directory didn't exist already, # else something went _really_ wrong, e. g. there's a # regular file with the name of the directory. if not self.path.isdir(next_directory): raise finally: self.chdir(old_dir)
(I changed the last
except
to afinally
.)
I had been thinking about this when I answered your other mail, but this solution again has the
isdir
call, so I thought you would run into the same problem as before.Now looking at this code, it seems that it would cover all cases the old code did. Also it passes all the unit tests I currently have. :-)
I think it would fail if there was a combination of the special case where you can't change into the last directory and the server pretends the directory is empty.
Still, I can't bring myself to include this in the regular ftputil because it's a workaround for a seemingly very rare case - and it has a complex nested structure of try - for - try - try. The complexity issue could maybe be remedied by extracting a part of the method into a new one, but the problem still seems too special to me to warrant the more complex implementation.
Admittedly, there's also (even more) complexity in ftputil because of problems with whitespace-containing directory names, but this was or is a problem with several servers, so the additional complexity seemed warranted.
For now, I'd recommend you override the original
makedirs
with your implementation in a derived class and use it instead of the nativeFTPHost
. If it fits your problem better, you may even want to monkey-patch the original class.
I added the above implementation suggestion as a patch in case someone else wants to apply it to the original code. That said, I rather recommend overriding the method in a derived class.
As Roger pointed out, the concept is known as "virtual directories" and it seems to be mostly used by Microsoft. So this doesn't seem to be so rare and "anomal" than I thought. The effect of virtual directories is that directory listings don't reflect all the actual contents.
If a server uses virtual directories, this will affect not only
makedirs
, but directly and indirectly all ftputil commands that directory rely on listings, likestat
,path.getmtime
,path.isdir
, to name a few.Possibly (some) problems due to virtual directories could be handled with a generic helper like
FTPHost._robust_ftp_command
instead of adding special workarounds to almost every API.As another approach, I wonder if it's feasible to let ftputil "remember" paths it has ever seen and "overlay" them with the visible directory contents. For example, if the user logs into the server and finds himself/herself in
/aaa/bbb
, althoughaaa
isn't visible in/
, ftputil could still store that/
has anaaa
directory and/aaa
has abbb
directory. However, there's the problem that if the contents of/
can't be retrieved, stat data foraaa
will be missing. The same may apply to other directory levels.
I had some other ideas. First, for reference, the current implementation is:
def makedirs(self, path, mode=None): """ Make the directory `path`, but also make not yet existing intermediate directories, like `os.makedirs`. The value of `mode` is only accepted for compatibility with `os.makedirs` but otherwise ignored. """ path = ftputil.tool.as_unicode(path) path = self.path.abspath(path) directories = path.split(self.sep) # Try to build the directory chain from the "uppermost" to # the "lowermost" directory. for index in range(1, len(directories)): # Re-insert the separator which got lost by using `path.split`. next_directory = self.sep + self.path.join(*directories[:index+1]) try: self.mkdir(next_directory) except ftputil.error.PermanentError: # Find out the cause of the error. Re-raise the # exception only if the directory didn't exist already, # else something went _really_ wrong, e. g. there's a # regular file with the name of the directory. if not self.path.isdir(next_directory): raise
Variant 1
Before iterating over the directories, get the current directory. Assume that this directory and all its parent directories exist. :-) In the following loop, ignore all directories where the path is contained in the previously determined current directory.
Pros:
- In my opinion easier to understand than the revised algorithm above.
- Should work as long as you changed into your virtual directory before calling
makedirs
.- Will save some server requests for the path components that already exist according to the current directory.
- Should be easy to understand as long as the "ignore all directories where the path is contained in the previously determined current directory" part can be expressed understandably in an extracted method.
Cons:
- Won't solve the original problem if the client did not change into "its" virtual directory before.
- This might make it (even more) difficult to understand what's going on if you're not in the virtual directory and the command fails. For a moment I thought about storing past current directories, but some of them could be deleted. To keep track of this, we'd need to add logic to update the "previously seen current directories" cache, which in my opinion isn't worth it.
Variant 2
Check if the full directory to make already exists. If yes, we're done (but see con below).
Split the full potential directory path into components as in the current implementation. Then start (trying) to create directories with the last directory (right to left), until this succeeds. (If we run into the root and still can't create a directory, raise an exception.) Then use
mkdir
to add the previously "missing" directory components from left to right.Pros:
- In my opinion easier to understand than the revised algorithm above if we can drop the nested
try
s.- Doesn't depend on the current directory like variant 1.
Cons:
- Might require more server requests because the algorithm tries to create a sequence of potentially non-existing directories. On the other hand, the current implementation has a similar weakness in that it might try to create directories that already exist.
- The initial check for the presence of the directory is critical. If the directory "above" the last directory is a virtual directory, a test with
path.isdir
won't see the last directory and returnFalse
. If the last directory doesn't have "change-into" permission, a test withchdir
will fail. This con is in common with the original implementation suggestion. That said, if we start at the end, it's less likely that we run into the invisible-directory problem.- It might be difficult to analyze the algorithm since
mkdir
has an additional failure mode apart from "no permission" and "already exists": "parent directory doesn't exist".Overall discussion
If we can assume that the current directory of the FTP client is inside the virtual directory, I like variant 1 the best.
Roger's suggestion would be very likely ok for me if the loop body is extracted into its own method. A slight annoyance is that by using
chdir
, the extracted method will modify some implicit instance state but not restore it.Variant 2 would be fine if there wasn't still the limitation with recognizing the existence of a directory. This variant also has an additional failure mode. When I initially thought about this algorithm, it seemed quite elegant, but at this point I hadn't thought about the case that the full directory might already exist.
I plan to use Roger's suggestion from comment 4. However, I first want to create a unit test first that involves a simulated virtual directory and thus fails for the old implementation.
Replying to schwa:
I plan to use Roger's suggestion from comment 4. However, I first want to create a unit test first that involves a simulated virtual directory and thus fails for the old implementation.
I applied a variant of Roger's patch in [271da08b936c34eb309298e00af5ce9ae21fe146](https://git.sr.ht/~sschwarzer/ftputil/commit/271da08b936c34eb309298e00af5ce9ae21fe146 "Apply a variant of Roger Demetrescu's patch (see ticket #86) Thanks, ..."). The only difference between the changeset and Roger's patch is some reformatting.
I didn't implement unit tests yet because this would probably very complicated with the current mock code in
test.mock_ftplib
. I plan to add tests for this ticket when I migrate the tests to use themock
library (see also ticket #98).