Dealing with 301 Redirects and Recursion
So, I came across an interesting scenario earlier today. I was trying to recursively download a website with wget2.
The website is hosted using nginx and uses the autoindex module, which means a request for /foo/ will automatically generate an index file and serve it.
The exact command I used was wget2 --no-parent -r example.com/foo/bar, where bar is a directory. As expected, the server responds with a 301 Moved Permanently redirect to /foo/bar/ and then serves an index file.
However, wget2 doesn't adopt the redirected location for the download; the IRI still contains /foo/bar as the location. Because that path has no trailing slash, bar is treated like a file inside /foo/. This means all files in /foo/ are also considered part of the current directory, even though they really belong to the parent.
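To make the effect concrete, here is a small Python sketch of how a --no-parent style check behaves when the directory is derived from the original IRI. This is only an illustration, not wget2's actual code: the helper names base_directory and allowed_by_no_parent are made up, and the http:// scheme is assumed because the original command omits it.

```python
from urllib.parse import urlsplit


def base_directory(url: str) -> str:
    """Directory part of the URL path: everything up to and including the last '/'."""
    path = urlsplit(url).path or "/"
    return path[: path.rfind("/") + 1]


def allowed_by_no_parent(start_url: str, candidate_url: str) -> bool:
    """Hypothetical scope check: the candidate must live under the start directory."""
    return urlsplit(candidate_url).path.startswith(base_directory(start_url))


sibling = "http://example.com/foo/other.txt"  # a file in the parent, /foo/

# Without the trailing slash, bar looks like a file in /foo/.
print(base_directory("http://example.com/foo/bar"))    # /foo/
print(allowed_by_no_parent("http://example.com/foo/bar", sibling))   # True

# With the trailing slash from the redirect, the scope is /foo/bar/.
print(base_directory("http://example.com/foo/bar/"))   # /foo/bar/
print(allowed_by_no_parent("http://example.com/foo/bar/", sibling))  # False
```

With the un-redirected IRI, the sibling file in /foo/ passes the check, which matches the behaviour I observed.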
Now, this can be easily dealt with if we simply update the IRI to the new one. But it may cause interesting side-effects, so I want to discuss this here before making any changes.