  1. :mod:`modulegraph.modulegraph` --- Find modules used by a script
  2. ================================================================
  3. .. module:: modulegraph.modulegraph
  4. :synopsis: Find modules used by a script
  5. This module defines :class:`ModuleGraph`, which is used to find
  6. the dependencies of scripts using bytecode analysis.
  7. A number of APIs in this module refer to filesystem paths. Those paths can refer to
  8. files inside zipfiles (for example when there are zipped egg files on :data:`sys.path`).
  9. Filenames referring to entries in a zipfile are not marked in any way: if ``"somepath.zip"``
  10. refers to a zipfile, then ``"somepath.zip/embedded/file"`` will be used to refer to
  11. ``embedded/file`` inside the zipfile.
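
The path convention above can be illustrated with a small standard-library sketch. The helper name ``split_zip_path`` is hypothetical and is not part of modulegraph; it only shows one way such a combined path could be separated into the archive and the entry inside it.

```python
import os
import tempfile
import zipfile


def split_zip_path(path):
    """Split *path* into (archive, inner); archive is None for ordinary paths.

    Hypothetical helper, not a modulegraph API.
    """
    head = path
    tail_parts = []
    # Strip trailing components until an existing filesystem entry is found.
    while head and not os.path.exists(head):
        head, tail = os.path.split(head)
        if not tail:
            break
        tail_parts.append(tail)
    if tail_parts and os.path.isfile(head) and zipfile.is_zipfile(head):
        return head, "/".join(reversed(tail_parts))
    return None, path


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "somepath.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("embedded/file", "data")
    # The combined path names an entry inside the archive.
    result = split_zip_path(os.path.join(archive, "embedded", "file"))
```

Paths that do not pass through a zipfile are returned unchanged, with ``None`` in the archive position.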
  12. The actual graph
  13. ----------------
  14. .. class:: ModuleGraph([path[, excludes[, replace_paths[, implies[, graph[, debug]]]]]])
  15. Create a new ModuleGraph object. Use the :meth:`run_script` method to add scripts
  16. and their dependencies to the graph.
  17. :param path: Python search path to use, defaults to :data:`sys.path`
  18. :param excludes: Iterable with module names that should not be included as a dependency
  19. :param replace_paths: List of pathname rewrites ``(old, new)``. When this argument is
  20. supplied the ``co_filename`` attributes of code objects get rewritten before scanning
  21. them for dependencies.
  22. :param implies: Implied module dependencies, a mapping from a module name to the list
  23. of modules it depends on. Use this to tell modulegraph about dependencies that cannot
  24. be found by code inspection (such as imports from C code or using the :func:`__import__`
  25. function).
  26. :param graph: A precreated :class:`Graph <altgraph.Graph.Graph>` object to use, the
  27. default is to create a new one.
  28. :param debug: The :class:`ObjectGraph <altgraph.ObjectGraph.ObjectGraph>` debug level.
  29. .. method:: run_script(pathname[, caller])
  30. Create, and return, a node by path (not module name). The *pathname* should
  31. refer to a Python source file and will be scanned for dependencies.
  32. The optional argument *caller* is the node that calls this script,
  33. and is used to add a reference in the graph.
  34. .. method:: import_hook(name[, caller[, fromlist[, level[, attr]]]])
  35. Import a module and analyse its dependencies.
  36. :arg name: The module name
  37. :arg caller: The node that caused the import to happen
  38. :arg fromlist: The list of names to import. This is an empty list for
  39. ``import name`` and a list of names for ``from name import a, b, c``.
  40. :arg level: The import level. The value should be ``-1`` for classical Python 2
  41. imports, ``0`` for absolute imports and a positive number for relative imports
  42. (where the value is the number of leading dots in the imported name).
  43. :arg attr: Attributes for the graph edge.
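
How *name*, *fromlist* and *level* relate to import statements can be seen with the standard :mod:`ast` module. This is only an illustrative sketch: ``describe_imports`` is a hypothetical helper, and modulegraph itself does not necessarily extract these values this way.

```python
import ast


def describe_imports(source):
    """Return (name, fromlist, level) for every import statement in *source*."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import name": empty fromlist, absolute import (level 0).
            for alias in node.names:
                found.append((alias.name, [], 0))
        elif isinstance(node, ast.ImportFrom):
            # "from name import a, b": the level counts the leading dots.
            names = [alias.name for alias in node.names]
            found.append((node.module or "", names, node.level))
    return found


examples = describe_imports(
    "import os.path\n"
    "from email import charset, message\n"
    "from ..pkg import helper\n"
)
```

Here ``from ..pkg import helper`` has two leading dots and therefore level ``2``, while the other two statements are absolute imports with level ``0``.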
  44. .. method:: implyNodeReference(node, other, edgeData=None)
  45. Explicitly mark that *node* depends on *other*. *other* is either
  46. a :class:`node <Node>` or the name of a module that will be
  47. searched for as if it were an absolute import.
  48. .. method:: createReference(fromnode, tonode[, edge_data])
  49. Create a reference from *fromnode* to *tonode*, with optional edge data.
  50. The default for *edge_data* is ``"direct"``.
  51. .. method:: getReferences(fromnode)
  52. Yield all nodes that *fromnode* refers to. That is, all modules imported
  53. by *fromnode*.
  54. Node :data:`None` is the root of the graph, and refers to all nodes that were
  55. explicitly imported by :meth:`run_script` or :meth:`import_hook`, unless you use
  56. an explicit parent with those methods.
  57. .. versionadded:: 0.11
  58. .. method:: getReferers(tonode, collapse_missing_modules=True)
  59. Yield all nodes that refer to *tonode*. That is, all modules that import
  60. *tonode*.
  61. If *collapse_missing_modules* is false this includes references from
  62. :class:`MissingModule` nodes, otherwise :class:`MissingModule` nodes
  63. are replaced by the "real" nodes that reference this missing node.
  64. .. versionadded:: 0.12
  65. .. method:: foldReferences(pkgnode)
  66. Hide all submodule nodes for package *pkgnode* and add ingoing and outgoing
  67. edges to *pkgnode* based on the edges from the submodule nodes.
  68. This can be used to simplify a module graph: after folding 'email' all
  69. references to modules in the 'email' package are references to the package.
  70. .. versionadded:: 0.11
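
The effect of folding can be sketched on a plain mapping from module name to the set of names it imports. This is a simplified model, not modulegraph's implementation: the function ``fold_references`` and the dict-based graph are assumptions made for illustration, and the sketch drops submodule entries where the real method hides the nodes.

```python
def fold_references(edges, package):
    """Redirect every edge touching a submodule of *package* to *package* itself."""
    prefix = package + "."

    def fold(name):
        return package if name.startswith(prefix) else name

    folded = {}
    for source, targets in edges.items():
        new_source = fold(source)
        bucket = folded.setdefault(new_source, set())
        for target in targets:
            new_target = fold(target)
            # Edges between modules inside the package collapse away.
            if new_target != new_source:
                bucket.add(new_target)
    return folded


edges = {
    "app": {"email.mime.text"},
    "email.mime.text": {"email.charset", "base64"},
    "email": set(),
}
folded = fold_references(edges, "email")
```

After folding, ``app`` refers to ``email`` directly, and the outgoing edge to ``base64`` is attributed to the ``email`` package.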
  71. .. method:: findNode(name)
  72. Find a node by identifier. If a node by that identifier exists, it will be returned.
  73. If a lazy node exists by that identifier with no dependencies (excluded), it will be
  74. instantiated and returned.
  75. If a lazy node exists by that identifier with dependencies, it and its
  76. dependencies will be instantiated and scanned for additional dependencies.
  77. .. method:: create_xref([out])
  78. Write an HTML file to the *out* stream (defaulting to :data:`sys.stdout`).
  79. The HTML file contains a textual description of the dependency graph.
  80. .. method:: graphreport([fileobj[, flatpackages]])
  81. .. todo:: To be documented
  82. .. method:: report()
  83. Print a report to stdout, listing the found modules with their
  84. paths, as well as modules that are missing, or seem to be missing.
  85. Mostly internal methods
  86. .......................
  87. The methods in this section should be considered as methods for subclassing at best,
  88. please let us know if you need these methods in your code as they are on track to be
  89. made private methods before the 1.0 release.
  90. .. warning:: The methods in this section will be refactored in a future release,
  91. the current architecture makes it unnecessarily hard to write proper tests.
  92. .. method:: determine_parent(caller)
  93. Returns the node of the package root for *caller*. If *caller* is a package
  94. this is the node itself, if the node is a module in a package this is the
  95. node for the package, and otherwise the *caller* is not part of a package and
  96. the result is :data:`None`.
  97. .. method:: find_head_package(parent, name[, level])
  98. .. todo:: To be documented
  99. .. method:: load_tail(mod, tail)
  100. This method is called to load the rest of a dotted name after loading the root
  101. of a package. This will import all intermediate modules as well (using
  102. :meth:`import_module`), and returns the module :class:`node <Node>` for the
  103. requested node.
  104. .. note:: When *tail* is empty this will just return *mod*.
  105. :arg mod: A start module (instance of :class:`Node`)
  106. :arg tail: The rest of a dotted name, can be empty
  107. :raise ImportError: When the requested module (or one of its parents) cannot be found
  108. :returns: the requested module
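
The walk over the rest of a dotted name can be sketched as follows. The generator ``dotted_steps`` is hypothetical and works on plain strings. It only shows which *partname* and *fqname* pairs would be visited for a given *tail*, on the assumption that each step corresponds to one :meth:`import_module` call.

```python
def dotted_steps(head, tail):
    """Yield (partname, fqname) for each component of *tail* below *head*."""
    fqname = head
    parts = tail.split(".") if tail else []
    for part in parts:
        fqname = fqname + "." + part
        yield part, fqname


steps = list(dotted_steps("xml", "dom.minidom"))
no_steps = list(dotted_steps("xml", ""))
```

An empty *tail* produces no steps, which matches the note that the start module is returned unchanged in that case.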
  109. .. method:: ensure_fromlist(m, fromlist)
  110. Yield all submodules that would be imported when importing *fromlist*
  111. from *m* (using ``from m import fromlist...``).
  112. *m* must be a package and not a regular module.
  113. .. method:: find_all_submodules(m)
  114. Yield the filenames for submodules in the same package as *m*.
  115. .. method:: import_module(partname, fqname, parent)
  116. Perform import of the module with basename *partname* (``path``) and
  117. full name *fqname* (``os.path``). Import is performed by *parent*.
  118. This will create a reference from the parent node to the
  119. module node and will load the module node when it is not already
  120. loaded.
  121. .. method:: load_module(fqname, fp, pathname, (suffix, mode, type))
  122. Load the module named *fqname* from the given *pathname*. The
  123. argument *fp* is either :data:`None`, or a stream from which the
  124. code for the Python module can be loaded (either byte-code or
  125. the source code). The elements of the *(suffix, mode, type)* tuple are the
  126. suffix of the source file, the open mode for the file and the
  127. type of module.
  128. Creates a node of the right class and processes the dependencies
  129. of the :class:`node <Node>` by scanning the byte-code for the node.
  130. Returns the resulting :class:`node <Node>`.
  131. .. method:: scan_code(code, m)
  132. Scan the *code* object for module *m* and update the dependencies of
  133. *m* using the import statements found in the code.
  134. This will automatically scan the code for nested functions, generator
  135. expressions and list comprehensions as well.
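
A minimal sketch of byte-code scanning with the standard :mod:`dis` module is shown below. It is not modulegraph's own scanner; ``imported_names`` is a hypothetical function that only collects the names of ``IMPORT_NAME`` instructions and recurses into nested code objects found in ``co_consts``.

```python
import dis
import types


def imported_names(code):
    """Collect module names imported by *code* and its nested code objects."""
    names = set()
    for instruction in dis.get_instructions(code):
        if instruction.opname == "IMPORT_NAME":
            names.add(instruction.argval)
    # Functions, comprehensions and similar constructs are nested code objects.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= imported_names(const)
    return names


source = (
    "import os\n"
    "def helper():\n"
    "    from xml.dom import minidom\n"
    "    return minidom\n"
)
found = imported_names(compile(source, "<example>", "exec"))
```

The import inside ``helper`` is only found because the nested code object is scanned as well.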
  136. .. method:: load_package(fqname, pathname)
  137. Load a package directory.
  138. .. method:: find_module(name, path[, parent])
  139. Locates a module named *name* that is not yet part of the
  140. graph. This method will raise :exc:`ImportError` when
  141. the module cannot be found or when it is already part
  142. of the graph. The *name* can not be a dotted name.
  143. The *path* is the search path used, or :data:`None` to
  144. use the default path.
  145. When the *parent* is specified *name* refers to a
  146. subpackage of *parent*, and *path* should be the
  147. search path of the parent.
  148. Returns the result of the global function
  149. :func:`find_module <modulegraph.modulegraph.find_module>`.
  150. .. method:: itergraphreport([name[, flatpackages]])
  151. .. todo:: To be documented
  152. .. method:: replace_paths_in_code(co)
  153. Replace the filenames in code object *co* using the *replace_paths* value that
  154. was passed to the constructor. Returns the rewritten code object.
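
The rewriting can be sketched with ``code.replace`` (available since Python 3.8). This is an illustration under assumptions, not the library's code: ``rewrite_filenames`` is a hypothetical function, and treating each ``(old, new)`` pair as a prefix substitution is an assumption about how the rewrites are applied.

```python
import types


def rewrite_filenames(code, replace_paths):
    """Return a copy of *code* with ``co_filename`` rewritten, recursively."""
    filename = code.co_filename
    for old, new in replace_paths:
        # Assumption: a rewrite applies when the filename starts with *old*.
        if filename.startswith(old):
            filename = new + filename[len(old):]
            break
    consts = tuple(
        rewrite_filenames(const, replace_paths)
        if isinstance(const, types.CodeType)
        else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=consts)


original = compile("def f():\n    return 1\n", "/build/tmp/mod.py", "exec")
rewritten = rewrite_filenames(original, [("/build/tmp", "/usr/lib")])
nested = [c for c in rewritten.co_consts if isinstance(c, types.CodeType)]
```

Both the module code object and the nested code object for ``f`` end up with the new filename, while *original* is left untouched.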
  155. .. method:: calc_setuptools_nspackages()
  156. Returns a mapping from package name to a list of paths where that package
  157. can be found in ``--single-version-externally-managed`` form.
  158. This method is used to be able to find those packages: these use
  159. a magic ``.pth`` file to ensure that the package is added to :data:`sys.path`,
  160. as they do not contain an ``__init__.py`` file.
  161. Packages in this form are used by system packages and the "pip"
  162. installer.
  163. Graph nodes
  164. -----------
  165. The :class:`ModuleGraph` contains nodes that represent the various types of modules.
  166. .. class:: Alias(value)
  167. This is a subclass of string that is used to mark module aliases.
  168. .. class:: Node(identifier)
  169. Base class for nodes, which provides the common functionality.
  170. Nodes can be used as mappings for storing arbitrary data in the node.
  171. Nodes are compared by comparing their *identifier*.
  172. .. data:: debug
  173. Debug level (integer)
  174. .. data:: graphident
  175. The node identifier, this is the value of the *identifier* argument
  176. to the constructor.
  177. .. data:: identifier
  178. The node identifier, this is the value of the *identifier* argument
  179. to the constructor.
  180. .. data:: filename
  181. The filename associated with this node.
  182. .. data:: packagepath
  183. The value of ``__path__`` for this node.
  184. .. data:: code
  185. The :class:`code object <types.CodeType>` associated with this node.
  186. .. data:: globalnames
  187. The set of global names that are assigned to in this module. This
  188. includes those names imported through star imports of Python modules.
  189. .. data:: starimports
  190. The set of star imports this module did that could not be resolved,
  191. i.e. a star import from a non-Python module.
  192. .. method:: __contains__(name)
  193. Return whether there is a value associated with *name*.
  194. This method is usually accessed as ``name in aNode``.
  195. .. method:: __setitem__(name, value)
  196. Set the value of *name* to *value*.
  197. This method is usually accessed as ``aNode[name] = value``.
  198. .. method:: __getitem__(name)
  199. Returns the value of *name*, raises :exc:`KeyError` when
  200. it cannot be found.
  201. This method is usually accessed as ``value = aNode[name]``.
  202. .. method:: get(name[, default])
  203. Returns the value of *name*, or the default value when it
  204. cannot be found. The *default* is :data:`None` when not specified.
  205. .. method:: infoTuple()
  206. Returns a tuple with information used in the :func:`repr`
  207. output for the node. Subclasses can add additional information
  208. to the result.
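
The mapping-style interface of :class:`Node` described above can be mimicked with a small stand-in class. ``SketchNode`` is hypothetical and only mirrors the documented behaviour (storage of arbitrary data, comparison by identifier). It is not the actual implementation.

```python
class SketchNode:
    """Stand-in for Node: compares by identifier and stores arbitrary data."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.graphident = identifier
        self._data = {}

    def __contains__(self, name):
        return name in self._data

    def __setitem__(self, name, value):
        self._data[name] = value

    def __getitem__(self, name):
        # Raises KeyError when *name* has no value, as documented.
        return self._data[name]

    def get(self, name, default=None):
        return self._data.get(name, default)

    def __eq__(self, other):
        return isinstance(other, SketchNode) and self.identifier == other.identifier

    def __hash__(self):
        return hash(self.identifier)


node = SketchNode("os.path")
node["origin"] = "example"
```

With this stand-in, ``"origin" in node`` is true, ``node.get("missing")`` returns :data:`None`, and ``node["missing"]`` raises :exc:`KeyError`.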
  209. .. class:: AliasNode(name, node)
  210. A node that represents an alias from a name to another node.
  211. The value of attribute *graphident* for this node will be the
  212. value of *name*, the other :class:`Node` attributes are
  213. references to those attributes in *node*.
  214. .. class:: BadModule(identifier)
  215. Base class for nodes that should be ignored for some reason
  216. .. class:: ExcludedModule(identifier)
  217. A module that is explicitly excluded.
  218. .. class:: MissingModule(identifier)
  219. A module that is imported but cannot be located.
  220. .. class:: Script(filename)
  221. A python script.
  222. .. data:: filename
  223. The filename for the script
  224. .. class:: BaseModule(name[, filename[, path]])
  225. The base class for actual modules. The *name* is
  226. the possibly dotted module name, *filename* is the
  227. filesystem path to the module and *path* is the
  228. value of ``__path__`` for the module.
  229. .. data:: graphident
  230. The name of the module
  231. .. data:: filename
  232. The filesystem path to the module.
  233. .. data:: path
  234. The value of ``__path__`` for this module.
  235. .. class:: BuiltinModule(name)
  236. A built-in module (one listed in :data:`sys.builtin_module_names`).
  237. .. class:: SourceModule(name)
  238. A module for which the python source code is available.
  239. .. class:: InvalidSourceModule(name)
  240. A module for which the python source code is available, but where
  241. that source code cannot be compiled (due to syntax errors).
  242. This is a subclass of :class:`SourceModule`.
  243. .. versionadded:: 0.12
  244. .. class:: CompiledModule(name)
  245. A module for which only byte-code is available.
  246. .. class:: Package(name)
  247. Represents a python package
  248. .. class:: NamespacePackage(name)
  249. Represents a python namespace package.
  250. This is a subclass of :class:`Package`.
  251. .. class:: Extension(name)
  252. A native extension
  253. .. warning:: A number of other node types are defined in the module. Those node types aren't
  254. used by modulegraph and will be removed in a future version.
  255. Edge data
  256. ---------
  257. The edges in a module graph by default contain information about the edge, represented
  258. by an instance of :class:`DependencyInfo`.
  259. .. class:: DependencyInfo(conditional, function, tryexcept, fromlist)
  260. This class is a :func:`namedtuple <collections.namedtuple>` for representing
  261. the information on a dependency between two modules.
  262. All attributes can be used to deduce if a dependency is essential or not, and
  263. are particularly useful when reporting on missing modules (dependencies on
  264. :class:`MissingModule`).
  265. .. data:: fromlist
  266. A boolean that is true iff the target of the edge is named in the "import"
  267. list of a "from" import ("from package import module").
  268. When the target module is imported multiple times this attribute is false
  269. unless all imports are in the "import" list of a "from" import.
  270. .. data:: function
  271. A boolean that is true iff the import is done inside a function definition,
  272. and is false for imports in module scope (or class scope for classes that
  273. aren't defined in a function).
  274. .. data:: tryexcept
  275. A boolean that is true iff the import is done in the "try" or "except"
  276. block of a try statement (but not in the "else" block).
  277. .. data:: conditional
  278. A boolean that is true iff the import is done in either block of an "if"
  279. statement.
  280. When the target of the edge is imported multiple times the :data:`function`,
  281. :data:`tryexcept` and :data:`conditional` attributes of all imports are
  282. merged: when there is an import where all these attributes are false the
  283. attributes are false, otherwise each attribute is set to true if it is
  284. true for at least one of the imports.
  285. For example, when a module is imported both in a try-except statement and
  286. furthermore is imported in a function (in two separate statements),
  287. both :data:`tryexcept` and :data:`function` will be true. But if there
  288. is a third unconditional top-level import for that module as well, all
  289. three attributes are false.
  290. .. warning::
  291. All attributes but :data:`fromlist` will be false when the source of
  292. a dependency is scanned from a byte-compiled module instead of a python
  293. source file. The :data:`fromlist` attribute will still be set correctly.
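
The merge rule for repeated imports of the same target can be sketched with a local namedtuple that has the documented fields. Both the stand-in ``DependencyInfo`` defined here and the function ``merge_info`` are illustrative assumptions. They follow the rule as described in this section and are not taken from the library's source.

```python
from collections import namedtuple

# Local stand-in with the same field order as the documented class.
DependencyInfo = namedtuple(
    "DependencyInfo", ["conditional", "function", "tryexcept", "fromlist"]
)


def merge_info(first, second):
    """Merge the edge data of two imports of the same target module."""
    # fromlist stays true only when every import is a "from" import.
    fromlist = first.fromlist and second.fromlist
    for info in (first, second):
        # One unconditional top-level import makes the dependency essential.
        if not (info.conditional or info.function or info.tryexcept):
            return DependencyInfo(False, False, False, fromlist)
    return DependencyInfo(
        conditional=first.conditional or second.conditional,
        function=first.function or second.function,
        tryexcept=first.tryexcept or second.tryexcept,
        fromlist=fromlist,
    )


in_try = DependencyInfo(False, False, True, False)
in_function = DependencyInfo(False, True, False, False)
top_level = DependencyInfo(False, False, False, False)

merged = merge_info(in_try, in_function)
with_top_level = merge_info(merged, top_level)
```

This reproduces the example from the text: the try-except import and the function-level import merge to an edge with both flags set, and adding an unconditional top-level import clears all three.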
  294. Utility functions
  295. -----------------
  296. .. function:: find_module(name[, path])
  297. A version of :func:`imp.find_module` that works with zipped packages (and other
  298. :pep:`302` importers).
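
For comparison, the standard library offers ``importlib.util.find_spec``, which also consults :pep:`302` style importers. The :mod:`imp` module itself was removed in Python 3.12. The sketch below is not modulegraph's ``find_module`` and returns different values; ``locate`` is a hypothetical helper. Note that ``find_spec`` imports parent packages when given a dotted name.

```python
import importlib.util


def locate(name):
    """Return the origin of module *name*, or raise ImportError."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("No module named %r" % (name,))
    return spec.origin


json_origin = locate("json")
```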
  299. .. function:: moduleInfoForPath(path)
  300. Return the module name, readmode and type for the file at *path*, or
  301. :data:`None` if it doesn't seem to be a valid module (based on its name).
  302. .. function:: addPackagePath(packagename, path)
  303. Add *path* to the value of ``__path__`` for the package named *packagename*.
  304. .. function:: replacePackage(oldname, newname)
  305. Rename *oldname* to *newname* when it is found by the module finder. This
  306. is used as a workaround for the hack that the ``_xmlplus`` package uses
  307. to inject itself in the ``xml`` namespace.