Commit 2fd0ee03 authored by SVN-Git Migration

Imported Upstream version 4.0.0~b5

parent 43411cb6
Behold, mortal, the origins of Beautiful Soup...
================================================
Leonard Richardson is the primary programmer.
Aaron DeVore is awesome.
Mark Pilgrim provided the encoding detection code that forms the base
of UnicodeDammit.
Thomas Kluyver and Ezio Melotti finished the work of getting Beautiful
Soup 4 working under Python 3.
Sam Ruby helped with a lot of edge cases.
Jonathan Ellis was awarded the prestigious Beau Potage D'Or for his
work in solving the nestable tags conundrum.
The following people have contributed patches to Beautiful Soup:
Istvan Albert, Andrew Lin, Anthony Baxter, Andrew Boyko, Tony Chang,
Zephyr Fang, Fuzzy, Roman Gaufman, Yoni Gilad, Richie Hindle, Peteris
Krumins, Kent Johnson, Ben Last, Robert Leftwich, Staffan Malmgren,
Ksenia Marasanova, JP Moins, Adam Monsen, John Nagle, "Jon", Ed
Oskiewicz, Greg Phillips, Giles Radford, Arthur Rudolph, Marko
Samastur, Jouni Seppänen, Alexander Schmolck, Andy Theyers, Glyn
Webster, Paul Wright, Danny Yoo
The following people made suggestions or found bugs or found ways to
break Beautiful Soup:
Hanno Böck, Matteo Bertini, Chris Curvey, Simon Cusack, Bruce Eckel,
Matt Ernst, Michael Foord, Tom Harris, Bill de hOra, Donald Howes,
Matt Patterson, Scott Roberts, Steve Strassmann, Mike Williams,
warchild at redho dot com, Sami Kuisma, Carlos Rocha, Bob Hutchison,
Joren Mc, Michal Migurski, John Kleven, Tim Heaney, Tripp Lilley, Ed
Summers, Dennis Sutch, Chris Smith, Aaron Sweep^W Swartz, Stuart
Turner, Greg Edwards, Kevin J Kalupson, Nikos Kouremenos, Artur de
Sousa Rocha, Yichun Wei, Per Vognsen
Beautiful Soup is made available under the MIT license:
Copyright (c) 2004-2011 Leonard Richardson
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE, DAMMIT.
Beautiful Soup incorporates code from the html5lib library, which is
also made available under the MIT license.
Metadata-Version: 1.0
Name: beautifulsoup4
Version: 4.0.0b3
Version: 4.0.0b5
Summary: UNKNOWN
Home-page: http://www.crummy.com/software/BeautifulSoup/bs4/
Author: Leonard Richardson
Author-email: leonardr@segfault.org
License: MIT
Download-URL: http://www.crummy.com/software/BeautifulSoup/download/4.x/
Download-URL: http://www.crummy.com/software/BeautifulSoup/bs4/download/
Description: Beautiful Soup sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
@@ -34,91 +34,29 @@
</tag3>
</tag1>
= About Beautiful Soup 4 =
= Full documentation =
This is a nearly-complete rewrite that removes Beautiful Soup's custom
HTML parser in favor of a system that lets you write a little glue
code and plug in any HTML or XML parser you want.
Beautiful Soup 4.0 comes with glue code for four parsers:
* Python's standard HTMLParser (html.parser in Python 3)
* lxml's HTML and XML parsers
* html5lib's HTML parser
HTMLParser is the default, but I recommend you install one of the
other parsers, or you'll have problems handling real-world markup.
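If you do install another parser, you can ask for it by name when you
build the soup. (A hedged sketch, not from this README; the markup and
parser names are illustrative.)

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML", "lxml")       # use lxml if installed
>>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML", "html5lib")   # or html5lib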
For complete documentation, see the Sphinx documentation in
docs/source. What follows is a summary of the changes from Beautiful
Soup 3.
== The module name has changed ==
Previously you imported the BeautifulSoup class from a module also
called BeautifulSoup. To save keystrokes and make it clear which
version of the API is in use, the module is now called 'bs4':
>>> from bs4 import BeautifulSoup
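
Parsing itself works the way it always has; here's a minimal sketch
(the document and the exact output shown are illustrative):

>>> soup = BeautifulSoup("<html><body><p>Hello</p></body></html>")
>>> soup.p
<p>Hello</p>
>>> soup.p.string
u'Hello'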
== It works with Python 3 ==
Beautiful Soup 3.1.0 worked with Python 3, but the parser it used was
so bad that it barely worked at all. Beautiful Soup 4 works with
Python 3, and since its parser is pluggable, you don't sacrifice
quality.
Special thanks to Thomas Kluyver and Ezio Melotti for getting Python 3
support to the finish line. Ezio Melotti is also to thank for greatly
improving the HTML parser that comes with Python 3.2.
== CDATA sections are normal text, if they're understood at all. ==
Currently, the lxml and html5lib HTML parsers ignore CDATA sections in
markup:
<p><![CDATA[foo]]></p> => <p></p>
A future version of html5lib will turn CDATA sections into text nodes,
but only within tags like <svg> and <math>:
<svg><![CDATA[foo]]></svg> => <svg>foo</svg>
The default XML parser (which uses lxml behind the scenes) turns CDATA
sections into ordinary text elements:
<p><![CDATA[foo]]></p> => <p>foo</p>
In theory it's possible to preserve the CDATA sections when using the
XML parser, but I don't see how to get it to work in practice.
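
Here is a hedged sketch of the same fragment run through an HTML
builder and through the ['lxml', 'xml'] builder (exact output can vary
with parser versions):

>>> from bs4 import BeautifulSoup
>>> BeautifulSoup("<p><![CDATA[foo]]></p>", "lxml").p
<p></p>
>>> BeautifulSoup("<p><![CDATA[foo]]></p>", ["lxml", "xml"]).p
<p>foo</p>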
== Miscellaneous other stuff ==
If the BeautifulSoup instance has .is_xml set to True, an appropriate
XML declaration will be emitted when the tree is transformed into a
string:
<?xml version="1.0" encoding="utf-8"?>
<markup>
...
</markup>
The ['lxml', 'xml'] tree builder sets .is_xml to True; the other tree
builders set it to False. If you want to parse XHTML with an HTML
parser, you can set it manually.
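
Putting those pieces together, a sketch of what this looks like in a
session (the output shown is illustrative):

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<markup><tag>value</tag></markup>", ["lxml", "xml"])
>>> soup.is_xml
True
>>> print(soup)
<?xml version="1.0" encoding="utf-8"?>
<markup><tag>value</tag></markup>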
The bs4/doc/ directory contains full documentation in Sphinx
format. Run "make html" in that directory to create HTML
documentation.
= Running the unit tests =
Here's how to run the tests on Python 2.7:
Beautiful Soup supports unit test discovery. You can run the tests
from the project root directory with this command:
$ cd bs4
$ python2.7 -m unittest discover -s bs4
$ python -m unittest discover -s bs4
Here's how to do it with Python 3.2:
If you checked out the source tree, you should see a script in the
home directory called test-all-versions. This script will run the unit
tests under Python 2.7, then create a temporary Python 3 conversion of
the source and run the unit tests again under Python 3.
$ ./convert-py3k
$ cd py3k/bs4
$ python3 -m unittest discover -s bs4
= Links =
The script test-all-versions will run the tests twice, once on Python
2.7 and once on Python 3.
Homepage: http://www.crummy.com/software/BeautifulSoup/bs4/
Documentation: http://www.crummy.com/software/BeautifulSoup/bs4/doc/
http://readthedocs.org/docs/beautiful-soup-4/
Discussion group: http://groups.google.com/group/beautifulsoup/
Development: https://code.launchpad.net/beautifulsoup/
Bug tracker: https://bugs.launchpad.net/beautifulsoup/
from bs4 import BeautifulSoup
BeautifulSoup(open("838800.html"), "html5lib")
@@ -18,7 +18,7 @@ http://www.crummy.com/software/BeautifulSoup/documentation.html
"""
__author__ = "Leonard Richardson (leonardr@segfault.org)"
__version__ = "4.0.0b3"
__version__ = "4.0.0b5"
__copyright__ = "Copyright (c) 2004-2012 Leonard Richardson"
__license__ = "MIT"
@@ -161,7 +161,8 @@ class BeautifulSoup(Tag):
if hasattr(markup, 'read'): # It's a file-type object.
markup = markup.read()
self.markup, self.original_encoding, self.declared_html_encoding = (
(self.markup, self.original_encoding, self.declared_html_encoding,
self.contains_replacement_characters) = (
self.builder.prepare_markup(markup, from_encoding))
try:
@@ -169,10 +170,10 @@ class BeautifulSoup(Tag):
except StopParsing:
pass
# Clear out the markup and the builder so they can be CGed.
# Clear out the markup and remove the builder's circular
# reference to this object.
self.markup = None
self.builder.soup = None
self.builder = None
def _feed(self):
# Convert the document to Unicode.
@@ -195,7 +196,19 @@ class BeautifulSoup(Tag):
def new_tag(self, name, **attrs):
"""Create a new tag associated with this soup."""
return Tag(None, None, name, attrs)
return Tag(None, self.builder, name, attrs)
def new_string(self, s):
"""Create a new NavigableString associated with this soup."""
navigable = NavigableString(s)
navigable.setup()
return navigable
def insert_before(self, successor):
raise ValueError("BeautifulSoup objects don't support insert_before().")
def insert_after(self, successor):
raise ValueError("BeautifulSoup objects don't support insert_after().")
def popTag(self):
tag = self.tagStack.pop()
@@ -297,9 +310,10 @@ class BeautifulSoup(Tag):
def decode(self, pretty_print=False,
eventual_encoding=DEFAULT_OUTPUT_ENCODING,
substitute_html_entities=False):
formatter="minimal"):
"""Returns a string or Unicode representation of this document.
To get Unicode, pass None for encoding."""
if self.is_xml:
# Print the XML declaration
encoding_part = ''
@@ -313,8 +327,7 @@
else:
indent_level = 0
return prefix + super(BeautifulSoup, self).decode(
indent_level, eventual_encoding,
substitute_html_entities)
indent_level, eventual_encoding, formatter)
class StopParsing(Exception):
@@ -72,7 +72,6 @@ class TreeBuilderRegistry(object):
# to look up builders in this registry.
builder_registry = TreeBuilderRegistry()
class TreeBuilder(object):
"""Turn a document into a Beautiful Soup object tree."""
@@ -83,6 +82,11 @@ class TreeBuilder(object):
empty_element_tags = None # A tag will be considered an empty-element
# tag when and only when it has no contents.
# A value for these attributes is a space- or comma-separated list
# of CDATA, rather than a single CDATA.
cdata_list_attributes = None
def __init__(self):
self.soup = None
@@ -115,7 +119,7 @@ class TreeBuilder(object):
def prepare_markup(self, markup, user_specified_encoding=None,
document_declared_encoding=None):
return markup, None, None
return markup, None, None, False
def test_fragment_to_document(self, fragment):
"""Wrap an HTML fragment to make it look like a document.
@@ -190,6 +194,16 @@ class HTMLTreeBuilder(TreeBuilder):
empty_element_tags = set(['br' , 'hr', 'input', 'img', 'meta',
'spacer', 'link', 'frame', 'base'])
# The HTML standard defines these attributes as containing a
# space-separated list of values, not a single value. That is,
# class="foo bar" means that the 'class' attribute has two values,
# 'foo' and 'bar', not the single value 'foo bar'. When we
# encounter one of these attributes, we will parse its value into
# a list of values if possible. Upon output, the list will be
# converted back into a string.
cdata_list_attributes = set(
['class', 'rel', 'rev', 'archive', 'accept-charset', 'headers'])
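# A hedged sketch of the behavior described above (illustrative, not
# part of this diff):
#
#   from bs4 import BeautifulSoup
#   soup = BeautifulSoup('<p class="foo bar"></p>')
#   soup.p['class']   # expected: ['foo', 'bar']
#   str(soup.p)       # expected: '<p class="foo bar"></p>'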
# Used by set_up_substitutions to detect the charset in a META tag
CHARSET_RE = re.compile("((^|;)\s*charset=)([^;]*)", re.M)
@@ -244,20 +258,20 @@ def register_treebuilders_from(module):
this_module.builder_registry.register(obj)
# Builders are registered in reverse order of priority, so that custom
# builder registrations will take precedence. In general, we want
# html5lib to take precedence over lxml, because it's more
# reliable. And we only want to use HTMLParser as a last resort.
# builder registrations will take precedence. In general, we want lxml
# to take precedence over html5lib, because it's faster. And we only
# want to use HTMLParser as a last resort.
from . import _htmlparser
register_treebuilders_from(_htmlparser)
try:
from . import _lxml
register_treebuilders_from(_lxml)
except ImportError:
# They don't have lxml installed.
pass
try:
from . import _html5lib
register_treebuilders_from(_html5lib)
except ImportError:
# They don't have html5lib installed.
pass
try:
from . import _lxml
register_treebuilders_from(_lxml)
except ImportError:
# They don't have lxml installed.
pass
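# A hedged illustration of the priority rule described in the comment
# above (assuming the registry's lookup() method behaves as in later
# releases; not part of this diff):
#
#   from bs4.builder import builder_registry
#   builder_registry.lookup('html')
#   # -> the lxml HTML builder if lxml is installed, otherwise html5lib's,
#   #    otherwise the HTMLParser-based builder registered first.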
@@ -9,7 +9,10 @@ from bs4.builder import (
HTMLTreeBuilder,
)
import html5lib
from html5lib.constants import DataLossWarning
from html5lib.constants import (
DataLossWarning,
namespaces,
)
import warnings
from bs4.element import (
Comment,
@@ -26,7 +29,7 @@ class HTML5TreeBuilder(HTMLTreeBuilder):
def prepare_markup(self, markup, user_specified_encoding):
# Store the user-specified encoding for use later on.
self.user_specified_encoding = user_specified_encoding
return markup, None, None
return markup, None, None, False
# These methods are defined by Beautiful Soup.
def feed(self, markup):
@@ -192,7 +195,10 @@ class Element(html5lib.treebuilders._base.Node):
def removeChild(self, node):
index = self._nodeIndex(node.parent, node)
del node.parent.element.contents[index]
# XXX This if statement is problematic:
# https://bugs.launchpad.net/beautifulsoup/+bug/838800
if index is not None:
del node.parent.element.contents[index]
node.element.parent = None
node.element.extract()
node.parent = None
@@ -4,12 +4,22 @@ __all__ = [
'HTMLParserTreeBuilder',
]
try:
from html.parser import HTMLParser
CONSTRUCTOR_TAKES_STRICT = True
except ImportError, e:
from HTMLParser import HTMLParser
CONSTRUCTOR_TAKES_STRICT = False
from HTMLParser import HTMLParser
import sys
# Starting in Python 3.2, the HTMLParser constructor takes a 'strict'
# argument, which we'd like to set to False. Unfortunately,
# http://bugs.python.org/issue13273 makes strict=True a better bet
# before Python 3.2.3.
#
# At the end of this file, we monkeypatch HTMLParser so that
# strict=True works well on Python 3.2.2.
major, minor, release = sys.version_info[:3]
CONSTRUCTOR_TAKES_STRICT = (
major > 3
or (major == 3 and minor > 2)
or (major == 3 and minor == 2 and release >= 3))
from bs4.element import (
CData,
Comment,
@@ -35,22 +45,24 @@ class HTMLParserTreeBuilder(HTMLParser, HTMLTreeBuilder):
def __init__(self, *args, **kwargs):
if CONSTRUCTOR_TAKES_STRICT:
kwargs['strict'] = True
kwargs['strict'] = False
return super(HTMLParserTreeBuilder, self).__init__(*args, **kwargs)
def prepare_markup(self, markup, user_specified_encoding=None,
document_declared_encoding=None):
"""
:return: A 3-tuple (markup, original encoding, encoding
declared within markup).
:return: A 4-tuple (markup, original encoding, encoding
declared within markup, whether any characters had to be
replaced with REPLACEMENT CHARACTER).
"""
if isinstance(markup, unicode):
return markup, None, None
return markup, None, None, False
try_encodings = [user_specified_encoding, document_declared_encoding]
dammit = UnicodeDammit(markup, try_encodings, isHTML=True)
dammit = UnicodeDammit(markup, try_encodings, is_html=True)
return (dammit.markup, dammit.original_encoding,
dammit.declared_html_encoding)
dammit.declared_html_encoding,
dammit.contains_replacement_characters)
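# A hedged sketch of the documented 4-tuple for a byte-string document
# (values are illustrative and depend on what UnicodeDammit detects;
# not part of this diff):
#
#   from bs4.builder._htmlparser import HTMLParserTreeBuilder
#   builder = HTMLParserTreeBuilder()
#   markup, original, declared, replaced = builder.prepare_markup(
#       b'<meta charset="utf-8"><p>caf\xc3\xa9</p>')
#   # original -> e.g. 'utf-8', the encoding that actually decoded the document
#   # declared -> e.g. 'utf-8', the encoding declared in the <meta> tag
#   # replaced -> False unless REPLACEMENT CHARACTER substitution was needed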
def feed(self, markup):
super(HTMLParserTreeBuilder, self).feed(markup)
@@ -108,3 +120,96 @@ class HTMLParserTreeBuilder(HTMLParser, HTMLTreeBuilder):
self.soup.handle_data(data)
self.soup.endData(ProcessingInstruction)
# Patch 3.2 versions of HTMLParser earlier than 3.2.3 to use some
# 3.2.3 code. This ensures they don't treat markup like <p></p> as a
# string.
#
# XXX This code can be removed once most Python 3 users are on 3.2.3.
if major == 3 and minor == 2 and not CONSTRUCTOR_TAKES_STRICT:
import re
attrfind_tolerant = re.compile(
r'\s*((?<=[\'"\s])[^\s/>][^\s/=>]*)(\s*=+\s*'
r'(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?')
HTMLParserTreeBuilder.attrfind_tolerant = attrfind_tolerant
locatestarttagend = re.compile(r"""
<[a-zA-Z][-.a-zA-Z0-9:_]* # tag name
(?:\s+ # whitespace before attribute name
(?:[a-zA-Z_][-.:a-zA-Z0-9_]* # attribute name
(?:\s*=\s* # value indicator
(?:'[^']*' # LITA-enclosed value
|\"[^\"]*\" # LIT-enclosed value
|[^'\">\s]+ # bare value
)
)?
)
)*
\s* # trailing whitespace
""", re.VERBOSE)
HTMLParserTreeBuilder.locatestarttagend = locatestarttagend
from html.parser import tagfind, attrfind
def parse_starttag(self, i):
self.__starttag_text = None
endpos = self.check_for_whole_start_tag(i)
if endpos < 0:
return endpos
rawdata = self.rawdata
self.__starttag_text = rawdata[i:endpos]
# Now parse the data between i+1 and j into a tag and attrs
attrs = []
match = tagfind.match(rawdata, i+1)
assert match, 'unexpected call to parse_starttag()'
k = match.end()
self.lasttag = tag = rawdata[i+1:k].lower()
while k < endpos:
if self.strict:
m = attrfind.match(rawdata, k)
else:
m = attrfind_tolerant.match(rawdata, k)
if not m:
break
attrname, rest, attrvalue = m.group(1, 2, 3)
if not rest:
attrvalue = None
elif attrvalue[:1] == '\'' == attrvalue[-1:] or \
attrvalue[:1] == '"' == attrvalue[-1:]:
attrvalue = attrvalue[1:-1]
if attrvalue:
attrvalue = self.unescape(attrvalue)
attrs.append((attrname.lower(), attrvalue))
k = m.end()
end = rawdata[k:endpos].strip()
if end not in (">", "/>"):
lineno, offset = self.getpos()
if "\n" in self.__starttag_text:
lineno = lineno + self.__starttag_text.count("\n")
offset = len(self.__starttag_text) \
- self.__starttag_text.rfind("\n")
else:
offset = offset + len(self.__starttag_text)
if self.strict:
self.error("junk characters in start tag: %r"
% (rawdata[k:endpos][:20],))
self.handle_data(rawdata[i:endpos])
return endpos
if end.endswith('/>'):
# XHTML-style empty tag: <span attr="value" />
self.handle_startendtag(tag, attrs)
else:
self.handle_starttag(tag, attrs)
if tag in self.CDATA_CONTENT_ELEMENTS:
self.set_cdata_mode(tag)
return endpos
def set_cdata_mode(self, elem):
self.cdata_elem = elem.lower()
self.interesting = re.compile(r'</\s*%s\s*>' % self.cdata_elem, re.I)
HTMLParserTreeBuilder.parse_starttag = parse_starttag
HTMLParserTreeBuilder.set_cdata_mode = set_cdata_mode
CONSTRUCTOR_TAKES_STRICT = True
@@ -50,12 +50,13 @@ class LXMLTreeBuilderForXML(TreeBuilder):
declared within markup).
"""
if isinstance(markup, unicode):
return markup, None, None
return markup, None, None, False
try_encodings = [user_specified_encoding, document_declared_encoding]
dammit = UnicodeDammit(markup, try_encodings, isHTML=True)
dammit = UnicodeDammit(markup, try_encodings, is_html=True)
return (dammit.markup, dammit.original_encoding,
dammit.declared_html_encoding)
dammit.declared_html_encoding,
dammit.contains_replacement_characters)
def feed(self, markup):
self.parser.feed(markup)
@@ -27,6 +27,10 @@ try:
except ImportError:
pass
xml_encoding_re = re.compile(
'^<\?.*encoding=[\'"](.*?)[\'"].*\?>'.encode(), re.I)
html_meta_re = re.compile(
'<\s*meta[^>]+charset\s*=\s*["\']?([^>]*?)[ /;\'">]'.encode(), re.I)
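# A hedged sketch of what these module-level patterns capture
# (byte-string input, matched case-insensitively; not part of this diff):
#
#   from bs4.dammit import xml_encoding_re, html_meta_re
#   xml_encoding_re.match(b'<?xml version="1.0" encoding="ISO-8859-1"?>').group(1)
#   # -> b'ISO-8859-1'
#   html_meta_re.search(b'<meta http-equiv="Content-Type" charset="utf-8">').group(1)
#   # -> b'utf-8'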
class EntitySubstitution(object):
@@ -165,17 +169,21 @@ class UnicodeDammit:
]
def __init__(self, markup, override_encodings=[],
smart_quotes_to=None, isHTML=False):
smart_quotes_to=None, is_html=False):
self.declared_html_encoding = None
self.markup, document_encoding, sniffed_encoding = \
self._detectEncoding(markup, isHTML)
self.smart_quotes_to = smart_quotes_to
self.tried_encodings = []
self.contains_replacement_characters = False
if markup == '' or isinstance(markup, unicode):
self.original_encoding = None
self.markup = markup
self.unicode_markup = unicode(markup)
self.original_encoding = None
return
self.markup, document_encoding, sniffed_encoding = \
self._detectEncoding(markup, is_html)
u = None
for proposed_encoding in (
override_encodings + [document_encoding, sniffed_encoding]):
@@ -195,6 +203,20 @@ class UnicodeDammit:
if u:
break
# As an absolute last resort, try the encodings again with
# character replacement.
if not u:
for proposed_encoding in (
override_encodings + [
document_encoding, sniffed_encoding, "utf-8", "windows-1252"]):
if proposed_encoding != "ascii":
u = self._convert_from(proposed_encoding, "replace")
if u is not None:
self.contains_replacement_characters = True
break
# We could at this point force it to ASCII, but that would
# destroy so much data that I think giving up is better
self.unicode_markup = u
if not u:
self.original_encoding = None
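# A hedged usage sketch of the last-resort fallback described above
# (illustrative; which encoding succeeds depends on what is detected):
#
#   from bs4.dammit import UnicodeDammit
#   dammit = UnicodeDammit(b"Sacr\xe9 bleu!", ["utf-8"], is_html=True)
#   dammit.unicode_markup                   # decoded text, possibly with U+FFFD
#   dammit.contains_replacement_characters  # True if "replace" mode was needed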
@@ -213,11 +235,11 @@ class UnicodeDammit:
sub = sub.encode()
return sub
def _convert_from(self, proposed):
def _convert_from(self, proposed, errors="strict"):
proposed = self.find_codec(proposed)
if not proposed or proposed in self.tried_encodings:
if not proposed or (proposed, errors) in self.tried_encodings:
return None
self.tried_encodings.append(proposed)
self.tried_encodings.append((proposed, errors))
markup = self.markup
# Convert smart quotes to HTML if coming from an encoding
@@ -229,18 +251,19 @@ class UnicodeDammit:
markup = smart_quotes_compiled.sub(self._sub_ms_char, markup)
try:
# print "Trying to convert document to %s" % proposed
u = self._to_unicode(markup, proposed)
#print "Trying to convert document to %s (errors=%s)" % (
# proposed, errors)
u = self._to_unicode(markup, proposed, errors)
self.markup = u
self.original_encoding = proposed
except Exception as e:
# print "That didn't work!"
# print e
#print "That didn't work!"
#print e
return None
#print "Correct encoding: %s" % proposed
return self.markup
def _to_unicode(self, data, encoding):
def _to_unicode(self, data, encoding, errors="strict"):
'''Given a string and its encoding, decodes the string into Unicode.
%encoding is a string recognized by encodings.aliases'''
@@ -262,10 +285,10 @@ class UnicodeDammit:
elif data[:4] == '\xff\xfe\x00\x00':
encoding = 'utf-32le'
data = data[4:]
newdata = unicode(data, encoding)
newdata = unicode(data, encoding, errors)
return newdata
def _detectEncoding(self, xml_data, isHTML=False):
def _detectEncoding(self, xml_data, is_html=False):
"""Given a document, tries to detect its XML encoding."""
xml_encoding = sniffed_xml_encoding = None
try:
@@ -315,16 +338,13 @@ class UnicodeDammit:
pass
except:
xml_encoding_match = None
xml_encoding_re = '^<\?.*encoding=[\'"](.*?)[\'"].*\?>'.encode()
xml_encoding_match = re.compile(xml_encoding_re).match(xml_data)
if not xml_encoding_match and isHTML:
meta_re = '<\s*meta[^>]+charset=([^>]*?)[;\'">]'.encode()
regexp = re.compile(meta_re, re.I)
xml_encoding_match = regexp.search(xml_data)
xml_encoding_match = xml_encoding_re.match(xml_data)
if not xml_encoding_match and is_html:
xml_encoding_match = html_meta_re.search(xml_data)
if xml_encoding_match is not None:
xml_encoding = xml_encoding_match.groups()[0].decode(
'ascii').lower()
if isHTML:
if is_html:
self.declared_html_encoding = xml_encoding
if sniffed_xml_encoding and \
(xml_encoding in ('iso-10646-ucs-2', 'ucs-2', 'csunicode',
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = build
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
clean:
-rm -rf $(BUILDDIR)/*
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo