sasdata.data_util.registry module

File extension registry.

This provides routines for opening files based on extension, and registers the built-in file extensions.

class sasdata.data_util.registry.CustomFileOpen(filename, mode='rb')

Bases: object

Custom context manager to fetch file contents depending on where the file is located.

__enter__()

A context method that either fetches a file from a URL or opens a local file.

__exit__(exc_type, exc_val, exc_tb)

Close all open file handles when exiting the context manager.

__init__(filename, mode='rb')
__module__ = 'sasdata.data_util.registry'
__weakref__

list of weak references to the object
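
The pattern described for this class can be sketched as follows. This is a minimal stand-alone illustration, not the sasdata implementation: the class name LocalOrRemoteOpen is hypothetical, and treating any name that starts with http:// or https:// as remote is an assumption made for this example.

```python
import io
import urllib.request


class LocalOrRemoteOpen:
    """Hypothetical stand-in for a context manager such as CustomFileOpen."""

    def __init__(self, filename, mode='rb'):
        self.filename = str(filename)
        self.mode = mode
        self.fd = None

    def __enter__(self):
        # Assumption: anything starting with http:// or https:// is remote.
        if self.filename.startswith(('http://', 'https://')):
            with urllib.request.urlopen(self.filename) as response:
                # Buffer the payload so callers get a seekable file-like object.
                self.fd = io.BytesIO(response.read())
        else:
            self.fd = open(self.filename, self.mode)
        return self.fd

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Close the handle that was opened; never suppress exceptions.
        if self.fd is not None:
            self.fd.close()
        return False
```

Used as `with LocalOrRemoteOpen(path) as fh: data = fh.read()`, the caller gets a binary file-like object either way and does not need to know where the bytes came from.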

class sasdata.data_util.registry.ExtensionRegistry

Bases: object

Associate a file loader with an extension.

Note that there may be multiple loaders for the same extension.

Example:

registry = ExtensionRegistry()

# Add an association by setting an element
registry['.zip'] = unzip

# Multiple extensions for one loader
registry['.tgz'] = untar
registry['.tar.gz'] = untar

# Generic extensions to use after trying more specific extensions;
# these will be checked after the more specific extensions fail.
registry['.gz'] = gunzip

# Multiple loaders for one extension
registry['.cx'] = cx1
registry['.cx'] = cx2
registry['.cx'] = cx3

# Show registered extensions
print(registry.extensions())

# Can also register a format name for explicit control from caller
registry['cx3'] = cx3
print(registry.formats())

# Retrieve loaders for a file name
registry.lookup('hello.cx')  # -> [cx3, cx2, cx1]

# Run loader on a filename; registry.load('hello.cx') behaves like:
#     try:
#         return cx3('hello.cx')
#     except Exception:
#         try:
#             return cx2('hello.cx')
#         except Exception:
#             return cx1('hello.cx')
registry.load('hello.cx')

# Load in a specific format ignoring extension; behaves like:
#     return cx3('hello.cx')
registry.load('hello.cx', ext='cx3')
__contains__(ext: str) → bool
__getitem__(ext: str) → List
__init__()
__module__ = 'sasdata.data_util.registry'
__setitem__(ext: str, loader)
__weakref__

list of weak references to the object

extensions() → List[str]

Return a sorted list of registered extensions.

formats() → List[str]

Return a sorted list of the registered formats.

load(path: str, ext: str | None = None) → List[Data1D | Data2D]

Call the loader for a single file.

Exceptions are stored in Data1D instances, with the errors held in Data1D.errors.
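
The idea of trying several loaders and keeping the exceptions instead of raising them can be sketched as below. This is an illustration only, under stated assumptions: the function name load_with_fallback and its (result, errors) return value are hypothetical, whereas the documented method stores the errors on Data1D instances.

```python
def load_with_fallback(path, loaders):
    """Try each loader in order and collect the exceptions that occur.

    Hypothetical helper: returns (result, errors), where result is the
    value from the first loader that succeeds, or None if all of them fail.
    """
    errors = []
    for loader in loaders:
        try:
            return loader(path), errors
        except Exception as exc:
            # Remember the failure and give the next loader a chance.
            errors.append(exc)
    return None, errors
```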

lookup(path: str) → List[callable]

Return the loaders associated with the file type of path.

Parameters:

path – Data file path

Returns:

List of available readers for the file extension (may be empty)
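
One plausible way to perform such a lookup is sketched below. It is not taken from the sasdata source: the function name lookup_loaders, the plain dict argument, and the rule that longer extensions are tried before shorter ones are assumptions chosen to match the class example, where '.tar.gz' is more specific than '.gz'.

```python
def lookup_loaders(path, loaders_by_ext):
    """Return the loaders whose extension matches path, most specific first.

    loaders_by_ext is a hypothetical dict mapping an extension such as
    '.gz' to a list of loader callables in the order they should be tried.
    """
    path_lower = path.lower()
    matches = [ext for ext in loaders_by_ext if path_lower.endswith(ext)]
    # Longer extensions are treated as more specific ('.tar.gz' before '.gz').
    matches.sort(key=len, reverse=True)
    found = []
    for ext in matches:
        for loader in loaders_by_ext[ext]:
            if loader not in found:
                found.append(loader)
    return found
```

An unknown extension yields an empty list, which matches the "may be empty" return value described above.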

sasdata.data_util.registry.create_empty_data_with_errors(path: str | Path, errors: List[Exception])

Create a Data1D instance that holds only errors and a file path. This allows every file path to yield a common data type, regardless of whether the data loading succeeded or failed.
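
The design choice here, returning a placeholder object rather than raising, can be illustrated with a small sketch. The names ErrorOnlyData and make_error_placeholder are hypothetical, and the dataclass merely stands in for Data1D; the attribute layout of the real class is not shown in this reference and is assumed.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Union


@dataclass
class ErrorOnlyData:
    """Hypothetical placeholder; the documented function returns a Data1D."""
    filename: str
    errors: List[str] = field(default_factory=list)


def make_error_placeholder(path: Union[str, Path],
                           errors: List[Exception]) -> ErrorOnlyData:
    # Keep the path and a readable message for each exception.
    return ErrorOnlyData(filename=str(path), errors=[str(e) for e in errors])
```

With this shape, a caller can loop over the results of many loads and check the errors attribute of each one, without wrapping every call in its own try block.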