== Motivation ==

This recipe evolved from a discussion of another recipe, [wiki:MarkupRecipes/SconsXIncludeScanner Implicit dependencies with scons], on [IrcChannel #markup]. The solution, and the source code used in this recipe, were kindly provided by Christopher Lenz (aka cmlenz).

When working with a relatively large set of XML sources that make use of XIncludes, a common question comes in two forms:

 * What files '''will''' be included by a particular source?
 * What files '''were''' included by a particular source?

This recipe seeks to address the first form as far as possible. [wiki:MarkupRecipes/SconsXIncludeScanner Implicit dependencies with scons] could be used as a starting point for answering the second.

== Code ==

scan-includes.py:
{{{
#!python
"""Recursive xincludes scanner for Markup

This solution was kindly provided by Christopher Lenz
"""
import os
import sys

from markup.core import START
from markup.filters import IncludeFilter
from markup.input import XMLParser

def scan_xincludes(filename):
    # All hrefs are resolved relative to the directory of the top-level file.
    basedir, filename = os.path.split(filename)
    namespace = IncludeFilter.NAMESPACE
    includes = set([filename])
    notfound = set()
    visited = set()

    def collect(filename):
        try:
            fileobj = open(os.path.join(basedir, filename), 'U')
            try:
                for kind, data, pos in XMLParser(fileobj, filename=filename):
                    if kind is START:
                        tag, attrib = data
                        if tag in namespace and tag.localname == 'include':
                            href = attrib.get('href')
                            # Ignore include elements that lack an href.
                            if href:
                                includes.add(href)
            finally:
                fileobj.close()
            visited.add(filename)
        except IOError:
            # The file could not be opened: report it instead of listing it.
            includes.remove(filename)
            notfound.add(filename)

    # Keep scanning until every file still listed has been visited.
    while len(includes) > len(visited):
        for filename in includes - visited:
            collect(filename)

    return includes, notfound

if __name__ == '__main__':
    includes, notfound = scan_xincludes(sys.argv[1])
    for include in includes:
        print(include)
    if notfound:
        print("WARNING: the following include hrefs were not found:")
        for ref in notfound:
            print(ref)
}}}

== Limitations ==

No consideration is given to conditional includes.
All includes that refer to existing files are listed. If you make use of conditional includes, this scanner will yield false positives.

No attempt is made to handle includes that use dynamically generated file names. Any such references will end up in the 'notfound' set.

So this recipe can only reliably answer "What files '''may''' be included by a particular source?"

== Discussion ==

Markup syntax supports conditional includes and includes whose target file names are dynamic. The latter makes it impossible to know for certain, "before the show", which files '''will''' be included. Conditional includes that depend only on static state could in principle be resolved before the show, but doing so is far from trivial.

Integrating Markup, or anything like it, into a build system is a typical scenario that prompts these questions. Typically you will want automatic dependencies and reliable but minimal rebuilds whenever any of your source files change.

For build system dependencies, the consequence of false positives is often acceptable: more sources are rebuilt than strictly necessary. And answering the second form of the question, "What files '''were''' included?", is usually sufficient to ensure that rebuilds are both minimal and correct.
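
As a rough illustration of why false positives are tolerable for dependency tracking, the sketch below decides whether a target needs rebuilding from a "may include" set such as the first value returned by `scan_xincludes`. The `needs_rebuild` helper, its parameters, and the timestamp comparison are assumptions made for this example; they are not part of Markup or of any particular build system, and names are resolved against a single base directory just as the scanner does.

```python
import os


def needs_rebuild(target, may_include, basedir="."):
    """Return True if `target` is missing or older than any existing source.

    `may_include` is a set of file names, e.g. the first value returned by
    scan_xincludes(). This helper is hypothetical and only illustrates the idea.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    for name in may_include:
        path = os.path.join(basedir, name)
        # Names that do not resolve to a file (for example dynamic hrefs)
        # cannot be checked, so they are skipped here.
        if os.path.exists(path) and os.path.getmtime(path) > target_mtime:
            return True
    return False
```

A file that is listed but never actually included can only cause an unnecessary rebuild, never a stale target. Includes that the scanner cannot resolve are a different matter: they are skipped, so a change to such a file would go unnoticed.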