sparc.docparser module
A module that parses the LaTeX documentation provided by SPARC and converts it into the Python API
Created on Wed Mar 1 15:32:31 EST 2023
Tian Tian (alchem0x2a@gmail.com)
- class sparc.docparser.SparcDocParser(directory='.', main_file='*Manual.tex', intro_file='Introduction.tex', params_from_intro=True, parse_version=True)[source]
Bases:
object
Parses LaTeX documentation of SPARC-X and converts it into a Python API.
This class extracts parameter information from LaTeX source files, organizing it into a structured format that can be easily used in Python. It supports parsing of version details, parameter types, units, and other relevant information.
- version
Parsed SPARC version, based on the documentation.
- Type:
str
- parameter_categories
Categories of parameters extracted.
- Type:
list
- parameters
Extracted parameters with detailed information.
- Type:
dict
- other_parameters
Additional parameters not categorized.
- Type:
dict
- json_from_directory(directory, include_subdirs, **kwargs)[source]
Class method to create JSON from a directory.
- json_from_repo(url, version, include_subdirs, **kwargs)[source]
Class method to create JSON from a repository.
- find_main_file(main_file_pattern)[source]
Finds the main LaTeX file that matches the given pattern, e.g. Manual.tex or Manual_cyclix.te
- Parameters:
main_file_pattern (str) – Pattern to match the main LaTeX file name.
- Returns:
Path to the main LaTeX file.
- Return type:
Path
- Raises:
FileNotFoundError – If no or multiple files match the pattern.
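The "exactly one match" rule described for find_main_file can be illustrated with pathlib globbing. This is a minimal sketch under assumptions, not the code used by SparcDocParser: find_single_match is a hypothetical helper name, and the real method may resolve the search directory differently.

```python
from pathlib import Path
import tempfile


def find_single_match(directory, pattern):
    """Return the only file in `directory` matching `pattern` (hypothetical helper)."""
    matches = sorted(Path(directory).glob(pattern))
    if len(matches) != 1:
        # Zero matches and several matches are both treated as errors.
        raise FileNotFoundError(
            f"Expected exactly one file matching {pattern!r}, found {len(matches)}"
        )
    return matches[0]


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Manual.tex").write_text("% main file\n")
    main_file_name = find_single_match(tmp, "*Manual.tex").name
    try:
        find_single_match(tmp, "*.missing")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

Raising the same exception for both "none" and "several" keeps the caller's error handling simple, at the cost of a less specific exception type for the ambiguous case.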
- get_include_files()[source]
Retrieves a list of LaTeX files included in the main LaTeX document, e.g. Manual.tex.
- Returns:
A list of paths to the included LaTeX files.
- Return type:
list
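One plausible way to collect included files is to scan the main document for \input and \include commands. The sketch below assumes those are the commands the manual uses; list_included_files and the file name System.tex are hypothetical and only serve as illustration.

```python
import re
from pathlib import Path

# Matches \input{name} and \include{name}. Which commands the SPARC manual
# actually uses is an assumption here. \includegraphics{...} is not matched,
# because the opening brace must directly follow "include".
INCLUDE_PATTERN = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def list_included_files(tex_source, root="."):
    """Return paths of the files referenced by the LaTeX source (hypothetical helper)."""
    paths = []
    for name in INCLUDE_PATTERN.findall(tex_source):
        path = Path(root) / name.strip()
        if path.suffix != ".tex":
            # LaTeX allows the .tex extension to be omitted.
            path = path.with_name(path.name + ".tex")
        paths.append(path)
    return paths


sample = r"""
\documentclass{article}
\begin{document}
\input{Introduction}
\include{System.tex}
\end{document}
"""
included = list_included_files(sample, root="doc")
```

Note that this simple pattern would also pick up commands on commented-out lines; a more careful parser would strip LaTeX comments first.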
- classmethod json_from_directory(directory='.', include_subdirs=True, **kwargs)[source]
Recursively add parameters from all Manual files
- Parameters:
directory (str or PosixPath) – The directory to the LaTeX files, e.g. <sparc-root>/doc/.LaTeX
include_subdirs (bool) – If true, also parse the manual files in submodules, e.g. cyclix, highT
- Returns:
Formatted json-string of the API
- Return type:
str
- classmethod json_from_repo(url='https://github.com/SPARC-X/SPARC.git', version='master', include_subdirs=True, **kwargs)[source]
Download the source code from git and use json_from_directory to parse
- Parameters:
url (str) – URL for the repository of SPARC, default is “https://github.com/SPARC-X/SPARC.git”
version (str) – Git version or commit hash of the SPARC repo
include_subdirs (bool) – If true, also parse the manual files in submodules, e.g. cyclix, highT
- Returns:
Formatted json-string of the API
- Return type:
str
- parse_parameters()[source]
Performs the actual parameter parsing.
- Sets:
- parameters (dict): All parsed parameters
- parameter_categories (list): List of categories
- other_parameters (dict): Any parameters that are not included in the categories
- parse_version(parse=True)[source]
Parses and sets the SPARC version based on the C-source file, if possible. The date for the SPARC code is parsed from initialization.c in the “YYYY.MM.DD” format.
- Parameters:
parse (bool) – Whether to parse the version from the documentation.
- Sets:
- self.version (str): The parsed version in ‘YYYY.MM.DD’ format or None,
if either parse=False, or the C-source code is missing
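The conversion of a date found in C source into the “YYYY.MM.DD” form can be sketched as follows. The exact wording of the date line in initialization.c is not given here, so the sample line and the regular expression are assumptions, and version_from_source is a hypothetical helper rather than the method's real implementation.

```python
import re

MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

# Assumed shape of the date text: "version <Month> <day>, <year>".
DATE_PATTERN = re.compile(r"version\s+([A-Za-z]+)\.?\s+(\d{1,2}),\s*(\d{4})")


def version_from_source(c_source):
    """Return 'YYYY.MM.DD' or None if no date is found (hypothetical helper)."""
    match = DATE_PATTERN.search(c_source)
    if match is None:
        return None
    month_name, day, year = match.groups()
    # Only the first three letters are used, so "Sept" is read as "Sep".
    month = MONTHS.get(month_name[:3].title())
    if month is None:
        return None
    return f"{int(year):04d}.{month:02d}.{int(day):02d}"


# Hypothetical line, not copied from the SPARC source.
sample_line = 'printf("SPARC (version Mar 01, 2023)\\n");'
parsed_version = version_from_source(sample_line)
```

Returning None instead of raising mirrors the documented behavior of leaving the version unset when the C source is missing.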
- sparc.docparser.convert_comment(text)[source]
Removes TeX-specific commands from descriptions and remarks as far as possible
- Parameters:
text (str) – Raw LaTeX code for the comment section in manual
- Returns:
Sanitized plain text
- Return type:
str
- sparc.docparser.convert_tex_default(text, desired_type=None)[source]
Convert default values as much as possible. The desired type determines the closest format the default values are converted to.
Currently supported conversions:
1. Remove all surrounding text modifiers (texttt)
2. Remove all symbol wrappers $
3. Convert value to single or array
- Parameters:
text (str) – Raw text string for value
desired_type (str or None) – Data type to be converted to. If None, preserve the string format
- Returns:
Value converted from raw text
- Return type:
converted
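The first two conversions listed for convert_tex_default (dropping \texttt wrappers and $ delimiters) can be approximated with a regular expression. This is an illustrative sketch, not the function's actual code; strip_tex_wrappers is a hypothetical name and only \texttt is handled.

```python
import re

TEXTTT_PATTERN = re.compile(r"\\texttt\{([^{}]*)\}")


def strip_tex_wrappers(text):
    """Remove \\texttt{...} wrappers and $ delimiters (hypothetical helper)."""
    previous = None
    # Repeat until nothing changes so that nested wrappers are also unwrapped.
    while previous != text:
        previous = text
        text = TEXTTT_PATTERN.sub(r"\1", text)
    # Drop math-mode delimiters.
    return text.replace("$", "").strip()


cleaned_number = strip_tex_wrappers(r"\texttt{$1e-6$}")
cleaned_word = strip_tex_wrappers(r"\texttt{None}")
```

The cleaned string can then be passed on to a type conversion step, which corresponds to the third item in the list.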
- sparc.docparser.convert_tex_example(text)[source]
Convert TeX codes of examples as much as possible. The examples follow the format SYMBOL: values (may contain new lines)
- Parameters:
text (str) – Single or multiline LaTeX contents
- Returns:
Sanitized literal text
- Return type:
str
- sparc.docparser.convert_tex_parameter(text)[source]
Convert a TeX string to a non-escaped name (for parameters only)
- Parameters:
text (str) – Parameter name in LaTeX format
- Returns:
Text with sanitized parameter
- Return type:
str
- sparc.docparser.is_array(text)[source]
Tries to convert a string into a NumPy array and checks whether its length is larger than 1. It is only used when comparing float / int values.
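The length check can be illustrated without NumPy. The sketch below is a standard-library stand-in, not the function's implementation: looks_like_array is a hypothetical name, and splitting on whitespace is an assumption about how values are separated.

```python
def looks_like_array(text):
    """Return True if `text` holds more than one numeric value (hypothetical helper)."""
    tokens = text.split()
    try:
        values = [float(token) for token in tokens]
    except ValueError:
        # Non-numeric content is never treated as a numeric array.
        return False
    return len(values) > 1


is_vector = looks_like_array("1 1 1")
is_scalar = looks_like_array("0.5")
```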
- sparc.docparser.sanitize_default(param_dict)[source]
Sanitize the default field:
1. Create an extra field default_remark that copies original default
2. Use convert_tex_default to convert values as much as possible
This function should be called after sanitize_type
- sparc.docparser.sanitize_description(param_dict)[source]
Sanitize the description and remark fields
- Parameters:
param_dict (dict) – Raw dict for one parameter entry
- Returns:
- Sanitized parameter dict with comment, remark and description
converted to human-readable formats
- Return type:
dict
- sparc.docparser.sanitize_type(param_dict)[source]
Sanitize the param dict so that the types are more consistent
For example, if the type is Double / Integer but the parameter is a vector, make it a double vector or integer vector
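The vector promotion can be sketched as below. The dictionary keys "type" and "default", the way a vector is detected (more than one whitespace-separated value in the default), and the name promote_vector_type are all assumptions made for illustration. The resulting labels follow the 'integer array' / 'double array' names listed for text2value.

```python
def promote_vector_type(param_dict):
    """Return a copy of `param_dict` with scalar types promoted to array types.

    Hypothetical helper: the real function may detect vectors differently.
    """
    result = dict(param_dict)  # leave the caller's dict untouched
    base_type = str(result.get("type", "")).strip().lower()
    n_values = len(str(result.get("default", "")).split())
    if base_type in ("double", "integer") and n_values > 1:
        result["type"] = f"{base_type} array"
    return result


vector_entry = promote_vector_type({"type": "Integer", "default": "1 1 1"})
scalar_entry = promote_vector_type({"type": "Double", "default": "0.1"})
```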
- sparc.docparser.text2value(text, desired_type)[source]
Convert raw text to a desired type
- Parameters:
text (str) – Text contents for the value
desired_type (str) – Target data type from ‘string’, ‘integer’, ‘integer array’, ‘double’, ‘double array’, ‘bool’, ‘bool array’
- Returns:
Value converted to the desired type
- Return type:
converted
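A conversion over the listed target types might look like the following sketch. It is not the library's code: to_value is a hypothetical name, arrays are returned as plain Python lists rather than NumPy arrays, and booleans are assumed to be written as 0 / 1.

```python
def to_value(text, desired_type):
    """Convert `text` to `desired_type` (hypothetical helper, standard library only)."""
    if desired_type == "string":
        return text.strip()

    scalar_converters = {
        "integer": int,
        "double": float,
        # Assumption: booleans appear as 0 / 1 in the raw text.
        "bool": lambda token: bool(int(token)),
    }
    is_array_type = desired_type.endswith(" array")
    base_type = desired_type[: -len(" array")] if is_array_type else desired_type
    if base_type not in scalar_converters:
        raise ValueError(f"Unsupported type: {desired_type!r}")

    values = [scalar_converters[base_type](token) for token in text.split()]
    if is_array_type:
        return values
    if len(values) != 1:
        raise ValueError(f"Expected a single value, got {len(values)}")
    return values[0]


single_int = to_value("3", "integer")
bool_list = to_value("1 0 1", "bool array")
double_list = to_value("0.5 1.5", "double array")
```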