All the code in `redun.File` and friends assumes the path is a single valid UTF-8 string.
But Python also accepts bytes as path objects. This is needed when working with filesystems that encode filenames in something other than UTF-8.
Some functions crash when `File` is given a bytes path:
- `File`: `get_filesystem_class()` and `get_proto()`, where the `urlparse` call fails on non-UTF-8 byte strings
- `Dir`: all of the above, and also concatenating the glob pattern, which complains that
  `TypeError: Can't mix strings and bytes in path components`
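Both failures can be reproduced with the standard library alone. The sketch below does not import redun; it assumes the glob pattern is joined with something equivalent to `os.path.join` (the error message matches, but I have not confirmed which call redun uses). The path and the `try_call` helper are made up for illustration.

```python
import os.path
from urllib.parse import urlparse

# A Latin-1 encoded filename: b"\xe9" is neither valid UTF-8 nor ASCII.
raw_path = b"/tmp/caf\xe9.txt"


def try_call(func, *args):
    """Return the name of the exception raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# urlparse decodes bytes input as strict ASCII before parsing it.
urlparse_error = try_call(urlparse, raw_path)

# Joining a bytes path with a str glob pattern mixes the two types.
join_error = try_call(os.path.join, raw_path, "*")

print(urlparse_error, join_error)
```

Note that `urlparse` decodes bytes as ASCII, so it rejects any non-ASCII byte string, not only ones that are invalid UTF-8.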
The workaround is to hack:
I also tried replacing `self.classes.File`, but it can't be overwritten (it uses `__getitem__`), so one would have to replace the whole `FileClasses` thing.
I think one of three things can be done here:
- either fix `File`, `Dir` and friends to work with non-UTF-8 bytestring paths
- or allow the user of the library to override `FileClasses`, `get_filesystem_class` and friends without so much monkeypatching
- or refactor the whole thing to only use `pathlib.Path`, as requested in "Use pathlib.Path instead of strings for path" (#8)
  - though I think `get_proto()` and `urlparse` would still crash when given non-UTF-8 bytestrings
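For the first option, one possible direction is to decode bytes paths with the `surrogateescape` error handler, so the rest of the code keeps working on `str` while the original bytes stay recoverable. This is only a sketch under that assumption, not something tested against redun; `to_str_path` and `to_bytes_path` are hypothetical helper names, not existing redun functions.

```python
from urllib.parse import urlparse


def to_str_path(path):
    """Hypothetical helper: map a bytes path to a str that round-trips."""
    if isinstance(path, bytes):
        # Undecodable bytes become lone surrogates instead of raising.
        return path.decode("utf-8", errors="surrogateescape")
    return path


def to_bytes_path(path):
    """Hypothetical helper: recover the original bytes from such a str."""
    if isinstance(path, str):
        return path.encode("utf-8", errors="surrogateescape")
    return path


raw_path = b"/tmp/caf\xe9.txt"
str_path = to_str_path(raw_path)

# urlparse accepts the surrogate-escaped str without decoding anything.
parsed = urlparse(str_path)

# The exact original bytes can be restored before touching the filesystem.
round_trip = to_bytes_path(parsed.path)
```

On POSIX, `os.fsdecode` and `os.fsencode` do the same thing using the filesystem encoding. One caveat: a surrogate-escaped string cannot be encoded with strict UTF-8, so any place that serializes or hashes the path as strict UTF-8 would still need attention.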
What do you think?