Commit 3da92cb0 authored by Stefano Rivera

New upstream version 2.18

parent 73d0dd76
+ Version 2.18 (04.07.2017)
- PR #161 & #184: Update bundled PLY version to 3.10
- PR #158: Add support for the __int128 type.
- PR #169: Handle more tricky TYPEID in declarators.
- PR #178: Add columns to the coord of each node
+ Version 2.17 (29.10.2016)
- Again functionality identical to 2.15 and 2.16; the difference is that the
......
Copyright (c) 2008-2016, Eli Bendersky
pycparser -- A C parser in Python
Copyright (c) 2008-2017, Eli Bendersky
All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
......
Metadata-Version: 1.1
Name: pycparser
Version: 2.17
Version: 2.18
Summary: C parser in Python
Home-page: https://github.com/eliben/pycparser
Author: Eli Bendersky
......
===============
pycparser v2.17
pycparser v2.18
===============
:Author: `Eli Bendersky <http://eli.thegreenplace.net>`_
......@@ -57,20 +57,20 @@ things up so that it parses code with a lot of GCC-isms successfully. See the
What grammar does pycparser follow?
-----------------------------------
**pycparser** very closely follows the C grammar provided in the end of the C99
standard document
**pycparser** very closely follows the C grammar provided in Annex A of the C99
standard (ISO/IEC 9899).
How is pycparser licensed?
--------------------------
BSD license. See the `LICENSE` file in the distribution.
`BSD license <https://github.com/eliben/pycparser/blob/master/LICENSE>`_.
Contact details
---------------
Drop me an email to eliben@gmail.com for any questions regarding **pycparser**.
For reporting problems with **pycparser** or submitting feature requests, the
best way is to open an `issue <https://github.com/eliben/pycparser/issues>`_.
For reporting problems with **pycparser** or submitting feature requests, please
open an `issue <https://github.com/eliben/pycparser/issues>`_, or submit a
pull request.
Installing
......@@ -85,7 +85,7 @@ Prerequisites
* **pycparser** has no external dependencies. The only non-stdlib library it
uses is PLY, which is bundled in ``pycparser/ply``. The current PLY version is
3.8, retrieved from `<http://www.dabeaz.com/ply/ply-3.8.tar.gz>`_
3.10, retrieved from `<http://www.dabeaz.com/ply/>`_
Installation process
--------------------
......@@ -111,6 +111,7 @@ Known problems
deleting the ``pycparser`` directory in your Python's ``site-packages``, or
wherever you installed it) and install again.
Using
=====
......@@ -119,10 +120,10 @@ Interaction with the C preprocessor
In order to be compilable, C code must be preprocessed by the C preprocessor -
``cpp``. ``cpp`` handles preprocessing directives like ``#include`` and
``#define``, removes comments, and does other minor tasks that prepare the C
``#define``, removes comments, and performs other minor tasks that prepare the C
code for compilation.
For all but the most trivial snippets of C code, **pycparser**, like a C
For all but the most trivial snippets of C code **pycparser**, like a C
compiler, must receive preprocessed C code in order to function correctly. If
you import the top-level ``parse_file`` function from the **pycparser** package,
it will interact with ``cpp`` for you, as long as it's in your PATH, or you
......@@ -136,13 +137,13 @@ and install a binary build of Clang for Windows `from this website
What about the standard C library headers?
------------------------------------------
C code almost always includes various header files from the standard C library,
like ``stdio.h``. While, with some effort, **pycparser** can be made to parse
the standard headers from any C compiler, it's much simpler to use the provided
"fake" standard includes in ``utils/fake_libc_include``. These are standard C
header files that contain only the bare necessities to allow valid parsing of
the files that use them. As a bonus, since they're minimal, it can significantly
improve the performance of parsing large C files.
C code almost always ``#include``\s various header files from the standard C
library, like ``stdio.h``. While (with some effort) **pycparser** can be made to
parse the standard headers from any C compiler, it's much simpler to use the
provided "fake" standard includes in ``utils/fake_libc_include``. These are
standard C header files that contain only the bare necessities to allow valid
parsing of the files that use them. As a bonus, since they're minimal, it can
significantly improve the performance of parsing large C files.
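As a rough sketch of the advice above, the fake headers can be passed to ``cpp`` through the ``cpp_args`` parameter of ``parse_file``. The helper names ``fake_libc_cpp_args`` and ``parse_with_fake_libc`` are hypothetical. The default include path assumes the current directory is the root of the source distribution, and ``cpp`` is assumed to be on your PATH.

```python
def fake_libc_cpp_args(fake_include_dir="utils/fake_libc_include"):
    """Build the cpp arguments that put the fake headers on the include path.

    Hypothetical helper; the default path assumes the current directory is
    the root of the pycparser source distribution.
    """
    return ["-I" + fake_include_dir]


def parse_with_fake_libc(filename, fake_include_dir="utils/fake_libc_include"):
    """Preprocess `filename` with cpp and the fake headers, then parse it.

    Hypothetical wrapper around pycparser.parse_file; it needs pycparser to
    be installed and `cpp` to be on the PATH.
    """
    # Imported lazily so the argument helper works without pycparser.
    from pycparser import parse_file

    return parse_file(
        filename,
        use_cpp=True,
        cpp_path="cpp",
        cpp_args=fake_libc_cpp_args(fake_include_dir),
    )
```

Calling ``parse_with_fake_libc('some_file.c').show()`` would then print the AST of a file that includes ``stdio.h`` without parsing the real system headers.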
The key point to understand here is that **pycparser** doesn't really care about
the semantics of types. It only needs to know whether some token encountered in
......@@ -169,6 +170,7 @@ created by the parser, see ``pycparser/_c_ast.cfg``.
There's also a `FAQ available here <https://github.com/eliben/pycparser/wiki/FAQ>`_.
In any case, you can always drop me an `email <eliben@gmail.com>`_ for help.
Modifying
=========
......@@ -213,15 +215,17 @@ utils/fake_libc_include:
utils/internal/:
Internal utilities for my own use. You probably don't need them.
Contributors
============
Some people have contributed to **pycparser** by opening issues on bugs they've
found and/or submitting patches. The list of contributors is in the CONTRIBUTORS
file in the source distribution. Once **pycparser** moved to Github, I stopped
file in the source distribution. After **pycparser** moved to Github I stopped
updating this list because Github does a much better job at tracking
contributions.
CI Status
=========
......@@ -232,3 +236,8 @@ CI Status
:align: center
:target: https://travis-ci.org/eliben/pycparser
AppVeyor also helps run tests on Windows:
.. image:: https://ci.appveyor.com/api/projects/status/wrup68o5y8nuk1i9?svg=true
:align: center
:target: https://ci.appveyor.com/project/eliben/pycparser/
......@@ -4,7 +4,7 @@
# Example of using pycparser.c_generator, serving as a simplistic translator
# from C to AST and back to C.
#
# Copyright (C) 2008-2015, Eli Bendersky
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#------------------------------------------------------------------------------
from __future__ import print_function
......
#------------------------------------------------------------------------------
# pycparser: c_json.py
#
# by Michael White (@mypalmike)
#
# This example includes functions to serialize and deserialize an ast
# to and from json format. Serializing involves walking the ast and converting
# each node from a python Node object into a python dict. Deserializing
# involves the opposite conversion, walking the tree formed by the
# dict and converting each dict into the specific Node object it represents.
# The dict itself is serialized and deserialized using the python json module.
#
# The dict representation is a fairly direct transformation of the object
# attributes. Each node in the dict gets one metadata field referring to the
# specific node class name, _nodetype. Each local attribute (i.e. not linking
# to child nodes) has a string value or array of string values. Each child
# attribute is either another dict or an array of dicts, exactly as in the
# Node object representation. The "coord" attribute, representing the
# node's location within the source code, is serialized/deserialized from
# a Coord object into a string of the format "filename:line[:column]".
#
# Example TypeDecl node, with IdentifierType child node, represented as a dict:
#     "type": {
#         "_nodetype": "TypeDecl",
#         "coord": "c_files/funky.c:8",
#         "declname": "o",
#         "quals": [],
#         "type": {
#             "_nodetype": "IdentifierType",
#             "coord": "c_files/funky.c:8",
#             "names": [
#                 "char"
#             ]
#         }
#     }
#------------------------------------------------------------------------------
from __future__ import print_function
import json
import sys
import re
# This is not required if you've installed pycparser into
# your site-packages/ with setup.py
#
sys.path.extend(['.', '..'])
from pycparser import parse_file, c_ast
from pycparser.plyparser import Coord
RE_CHILD_ARRAY = re.compile(r'(.*)\[(.*)\]')
RE_INTERNAL_ATTR = re.compile('__.*__')
class CJsonError(Exception):
    pass


def memodict(fn):
    """ Fast memoization decorator for a function taking a single argument """
    class memodict(dict):
        def __missing__(self, key):
            ret = self[key] = fn(key)
            return ret
    return memodict().__getitem__


@memodict
def child_attrs_of(klass):
    """
    Given a Node class, get a set of child attrs.
    Memoized to avoid highly repetitive string manipulation
    """
    non_child_attrs = set(klass.attr_names)
    all_attrs = set([i for i in klass.__slots__ if not RE_INTERNAL_ATTR.match(i)])
    return all_attrs - non_child_attrs


def to_dict(node):
    """ Recursively convert an ast into dict representation. """
    klass = node.__class__

    result = {}

    # Metadata
    result['_nodetype'] = klass.__name__

    # Local node attributes
    for attr in klass.attr_names:
        result[attr] = getattr(node, attr)

    # Coord object
    if node.coord:
        result['coord'] = str(node.coord)
    else:
        result['coord'] = None

    # Child attributes
    for child_name, child in node.children():
        # Child strings are either simple (e.g. 'value') or arrays (e.g. 'block_items[1]')
        match = RE_CHILD_ARRAY.match(child_name)
        if match:
            array_name, array_index = match.groups()
            array_index = int(array_index)
            # arrays come in order, so we verify and append.
            result[array_name] = result.get(array_name, [])
            if array_index != len(result[array_name]):
                raise CJsonError('Internal ast error. Array {} out of order. '
                    'Expected index {}, got {}'.format(
                    array_name, len(result[array_name]), array_index))
            result[array_name].append(to_dict(child))
        else:
            result[child_name] = to_dict(child)

    # Any child attributes that were missing need "None" values in the json.
    for child_attr in child_attrs_of(klass):
        if child_attr not in result:
            result[child_attr] = None

    return result


def to_json(node, **kwargs):
    """ Convert ast node to json string """
    return json.dumps(to_dict(node), **kwargs)


def file_to_dict(filename):
    """ Load C file into dict representation of ast """
    ast = parse_file(filename, use_cpp=True)
    return to_dict(ast)


def file_to_json(filename, **kwargs):
    """ Load C file into json string representation of ast """
    ast = parse_file(filename, use_cpp=True)
    return to_json(ast, **kwargs)


def _parse_coord(coord_str):
    """ Parse coord string (file:line[:column]) into Coord object. """
    if coord_str is None:
        return None

    vals = coord_str.split(':')
    vals.extend([None] * 3)
    filename, line, column = vals[:3]
    return Coord(filename, line, column)


def _convert_to_obj(value):
    """
    Convert an object in the dict representation into an object.
    Note: Mutually recursive with from_dict.
    """
    value_type = type(value)
    if value_type == dict:
        return from_dict(value)
    elif value_type == list:
        return [_convert_to_obj(item) for item in value]
    else:
        # String
        return value


def from_dict(node_dict):
    """ Recursively build an ast from dict representation """
    class_name = node_dict.pop('_nodetype')

    klass = getattr(c_ast, class_name)

    # Create a new dict containing the key-value pairs which we can pass
    # to node constructors.
    objs = {}
    for key, value in node_dict.items():
        if key == 'coord':
            objs[key] = _parse_coord(value)
        else:
            objs[key] = _convert_to_obj(value)

    # Use keyword parameters, which works thanks to beautifully consistent
    # ast Node initializers.
    return klass(**objs)


def from_json(ast_json):
    """ Build an ast from json string representation """
    return from_dict(json.loads(ast_json))


#------------------------------------------------------------------------------
if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Some test code...
        # Do trip from C -> ast -> dict -> ast -> json, then print.
        ast_dict = file_to_dict(sys.argv[1])
        ast = from_dict(ast_dict)
        print(to_json(ast, sort_keys=True, indent=4))
    else:
        print("Please provide a filename as argument")
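The array handling in to_dict can be tried in isolation with a small stdlib-only sketch. The function name group_children and the sample child names are hypothetical. Real input would be the (name, node) pairs that Node.children() yields; the values here are plain strings rather than nodes, and the error type is a plain ValueError instead of CJsonError.

```python
import re

# Same pattern as RE_CHILD_ARRAY in c_json.py: splits 'block_items[1]'
# into the attribute name and the index.
RE_CHILD_ARRAY = re.compile(r'(.*)\[(.*)\]')


def group_children(children):
    """Group (child_name, value) pairs into a dict, collecting indexed
    names such as 'params[0]', 'params[1]' into a list under 'params'.

    Hypothetical helper that mirrors the loop in to_dict without recursion.
    """
    result = {}
    for child_name, value in children:
        match = RE_CHILD_ARRAY.match(child_name)
        if match:
            array_name, array_index = match.groups()
            array_index = int(array_index)
            items = result.setdefault(array_name, [])
            # Indexed children are expected in order, so each index must
            # equal the current length of the list.
            if array_index != len(items):
                raise ValueError('array %s out of order: expected %d, got %d'
                                 % (array_name, len(items), array_index))
            items.append(value)
        else:
            result[child_name] = value
    return result


grouped = group_children([
    ('type', 'TypeDecl'),
    ('params[0]', 'Decl a'),
    ('params[1]', 'Decl b'),
])
print(grouped)
```

Appending in arrival order and checking the index keeps the lists dense, so the JSON arrays line up with the order of the children in the AST.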
......@@ -6,19 +6,33 @@
#
# The AST generated by pycparser from the given declaration is traversed
# recursively to build the explanation. Note that the declaration must be a
# valid external declaration in C. All the types used in it must be defined with
# typedef, or parsing will fail. The definition can be arbitrary - pycparser
# doesn't really care what the type is defined to be, only that it's a type.
# valid external declaration in C. As shown below, typedef can be optionally
# expanded.
#
# For example:
#
# 'typedef int Node; const Node* (*ar)[10];'
# =>
# ar is a pointer to array[10] of pointer to const Node
# c_decl = 'typedef int Node; const Node* (*ar)[10];'
#
# Copyright (C) 2008-2015, Eli Bendersky
# explain_c_declaration(c_decl)
# => ar is a pointer to array[10] of pointer to const Node
#
# struct and typedef can be optionally expanded:
#
# explain_c_declaration(c_decl, expand_typedef=True)
# => ar is a pointer to array[10] of pointer to const int
#
# c_decl = 'struct P {int x; int y;} p;'
#
# explain_c_declaration(c_decl)
# => p is a struct P
#
# explain_c_declaration(c_decl, expand_struct=True)
# => p is a struct P containing {x is a int, y is a int}
#
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
import copy
import sys
# This is not required if you've installed pycparser into
......@@ -29,12 +43,15 @@ sys.path.extend(['.', '..'])
from pycparser import c_parser, c_ast
def explain_c_declaration(c_decl):
def explain_c_declaration(c_decl, expand_struct=False, expand_typedef=False):
""" Parses the declaration in c_decl and returns a text
explanation as a string.
The last external node of the string is used, to allow
earlier typedefs for used types.
The last external node of the string is used, to allow earlier typedefs
for used types.
expand_struct=True will spell out struct definitions recursively.
expand_typedef=True will expand typedef'd types.
"""
parser = c_parser.CParser()
......@@ -49,7 +66,14 @@ def explain_c_declaration(c_decl):
):
return "Not a valid declaration"
return _explain_decl_node(node.ext[-1])
try:
expanded = expand_struct_typedef(node.ext[-1], node,
expand_struct=expand_struct,
expand_typedef=expand_typedef)
except Exception as e:
return "Not a valid declaration: " + str(e)
return _explain_decl_node(expanded)
def _explain_decl_node(decl_node):
......@@ -95,6 +119,75 @@ def _explain_type(decl):
return ('function(%s) returning ' % (args) +
_explain_type(decl.type))
elif typ == c_ast.Struct:
decls = [_explain_decl_node(mem_decl) for mem_decl in decl.decls]
members = ', '.join(decls)
return ('struct%s ' % (' ' + decl.name if decl.name else '') +
('containing {%s}' % members if members else ''))
def expand_struct_typedef(cdecl, file_ast,
                          expand_struct=False,
                          expand_typedef=False):
    """Expand struct & typedef and return a new expanded node."""
    decl_copy = copy.deepcopy(cdecl)
    _expand_in_place(decl_copy, file_ast, expand_struct, expand_typedef)
    return decl_copy


def _expand_in_place(decl, file_ast, expand_struct=False, expand_typedef=False):
    """Recursively expand struct & typedef in place; raise RuntimeError if
    an undeclared struct or typedef is used.
    """
    typ = type(decl)

    if typ in (c_ast.Decl, c_ast.TypeDecl, c_ast.PtrDecl, c_ast.ArrayDecl):
        decl.type = _expand_in_place(decl.type, file_ast, expand_struct,
                                     expand_typedef)

    elif typ == c_ast.Struct:
        if not decl.decls:
            struct = _find_struct(decl.name, file_ast)
            if not struct:
                raise RuntimeError('using undeclared struct %s' % decl.name)
            decl.decls = struct.decls

        for i, mem_decl in enumerate(decl.decls):
            decl.decls[i] = _expand_in_place(mem_decl, file_ast, expand_struct,
                                             expand_typedef)
        if not expand_struct:
            decl.decls = []

    elif (typ == c_ast.IdentifierType and
          decl.names[0] not in ('int', 'char')):
        typedef = _find_typedef(decl.names[0], file_ast)
        if not typedef:
            raise RuntimeError('using undeclared type %s' % decl.names[0])

        if expand_typedef:
            return typedef.type

    return decl


def _find_struct(name, file_ast):
    """Receives a struct name and returns the struct object declared in
    file_ast, or None if there is no such declaration.
    """
    for node in file_ast.ext:
        if (type(node) == c_ast.Decl and
                type(node.type) == c_ast.Struct and
                node.type.name == name):
            return node.type


def _find_typedef(name, file_ast):
    """Receives a type name and returns the matching typedef object in
    file_ast, or None if there is no such typedef.
    """
    for node in file_ast.ext:
        if type(node) == c_ast.Typedef and node.name == name:
            return node


if __name__ == "__main__":
if len(sys.argv) > 1:
......
#-----------------------------------------------------------------
# pycparser: dump_ast.py
#
# Basic example of parsing a file and dumping its parsed AST.
#
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
import argparse
import sys
# This is not required if you've installed pycparser into
# your site-packages/ with setup.py
sys.path.extend(['.', '..'])
from pycparser import c_parser, c_ast, parse_file
if __name__ == "__main__":
    argparser = argparse.ArgumentParser('Dump AST')
    argparser.add_argument('filename', help='name of file to parse')
    args = argparser.parse_args()

    ast = parse_file(args.filename, use_cpp=False)
    ast.show()
......@@ -9,7 +9,7 @@
# information from the AST.
# It helps to have the pycparser/_c_ast.cfg file in front of you.
#
# Copyright (C) 2008-2015, Eli Bendersky
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
......@@ -32,7 +32,7 @@ from pycparser import c_parser, c_ast
# to, so I've inserted the dummy typedef in the code to let the
# parser know Hash and Node are types. You don't need to do it
# when parsing real, correct C code.
#
text = r"""
typedef int Node, Hash;
......@@ -66,8 +66,8 @@ ast = parser.parse(text, filename='<none>')
# readable way. show() is the most useful tool in exploring ASTs
# created by pycparser. See the c_ast.py file for the options you
# can pass it.
#
#~ ast.show()
#ast.show(showcoord=True)
# OK, we've seen that the top node is FileAST. This is always the
# top node of the AST. Its children are "external declarations",
......@@ -79,48 +79,49 @@ ast = parser.parse(text, filename='<none>')
# ext[] holds the children of FileAST. Since the function
# definition is the third child, it's ext[2]. Uncomment the
# following line to show it:
#
#~ ast.ext[2].show()
#ast.ext[2].show()
# A FuncDef consists of a declaration, a list of parameter
# declarations (for K&R style function definitions), and a body.
# First, let's examine the declaration.
#
function_decl = ast.ext[2].decl
# function_decl, like any other declaration, is a Decl. Its type child
# is a FuncDecl, which has a return type and arguments stored in a
# ParamList node
#~ function_decl.type.show()
#~ function_decl.type.args.show()
#function_decl.type.show()
#function_decl.type.args.show()
# The following displays the name and type of each argument:
#
#~ for param_decl in function_decl.type.args.params:
#~ print('Arg name: %s' % param_decl.name)
#~ print('Type:')
#~ param_decl.type.show(offset=6)
#for param_decl in function_decl.type.args.params:
#print('Arg name: %s' % param_decl.name)
#print('Type:')
#param_decl.type.show(offset=6)
# The body of FuncDef is a Compound, which is a placeholder for a block
# surrounded by {} (You should be reading _c_ast.cfg parallel to this
# explanation and seeing these things with your own eyes).
# Let's see the block's declarations:
#
function_body = ast.ext[2].body
# The following displays the declarations and statements in the function
# body
#
#~ for decl in function_body.block_items:
#~ decl.show()
#for decl in function_body.block_items:
#decl.show()
# We can see a single variable declaration, i, declared to be a simple type
# declaration of type 'unsigned int', followed by statements.
# block_items is a list, so the third element is the For statement:
#
for_stmt = function_body.block_items[2]
#~ for_stmt.show()
#for_stmt.show()
# As you can see in _c_ast.cfg, For's children are 'init, cond,
# next' for the respective parts of the 'for' loop specifier,
......@@ -128,26 +129,26 @@ for_stmt = function_body.block_items[2]
# a block.
#
# Let's dig deeper, to the while statement inside the for loop:
#
while_stmt = for_stmt.stmt.block_items[1]
#~ while_stmt.show()
#while_stmt.show()
# While is simpler, it only has a condition node and a stmt node.
# The condition:
#
while_cond = while_stmt.cond
#~ while_cond.show()
#while_cond.show()
# Note that it's a BinaryOp node - the basic constituent of
# expressions in our AST. BinaryOp is the expression tree, with
# left and right nodes as children. It also has the op attribute,
# which is just the string representation of the operator.
#
#~ print(while_cond.op)
#~ while_cond.left.show()
#~ while_cond.right.show()
#
#print(while_cond.op)
#while_cond.left.show()
#while_cond.right.show()
# That's it for the example. I hope you now see how easy it is to explore the
# AST created by pycparser. Although on the surface it is quite complex and has
# a lot of node types, this is the inherent complexity of the C language every
......@@ -156,6 +157,3 @@ while_cond = while_stmt.cond
# structure of AST nodes and write code that processes them.
# Specifically, see the cdecl.py example for a non-trivial demonstration of what
# you can do by recursively going through the AST.
#
#-----------------------------------------------------------------
# pycparser: func_defs.py
# pycparser: func_calls.py
#
# Using pycparser for printing out all the calls of some function
# in a C file.
#
# Copyright (C) 2008-2015, Eli Bendersky
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
......
......@@ -7,7 +7,7 @@
# This is a simple example of traversing the AST generated by
# pycparser. Call it from the root directory of pycparser.
#
# Copyright (C) 2008-2015, Eli Bendersky
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
......
#-----------------------------------------------------------------
# pycparser: func_write.py
# pycparser: rewrite_ast.py
#
# Tiny example of rewriting an AST node
#
# Copyright (C) 2014, Akira Hayakawa
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
......
#-----------------------------------------------------------------
# pycparser: serialize_ast.py
#
# Simple example of serializing AST
#
# Hart Chu [https://github.com/CtheSky]
# Eli Bendersky [http://eli.thegreenplace.net]
# License: BSD
#-----------------------------------------------------------------
from __future__ import print_function
import pickle
from pycparser import c_parser
text = r"""
void func(void)
{
x = 1;
}
"""
parser = c_parser.CParser()
ast = parser.parse(text)
# Since AST nodes use __slots__ for faster attribute access and
# space saving, pickling them requires protocol version >= 2.
# The default protocol is 0 on Python 2.7 and 3 or higher on Python 3.x.
# You can always select the highest available protocol with the -1 argument.
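The protocol remark can be checked without pycparser. The sketch below uses a hypothetical SlotNode class as a stand-in for an AST node (it is not part of pycparser) and round-trips it through pickle with the highest available protocol.

```python
import pickle


class SlotNode(object):
    """Hypothetical stand-in for an AST node that declares __slots__."""
    __slots__ = ('name', 'children')

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


tree = SlotNode('FileAST', [SlotNode('FuncDef')])

# protocol=-1 selects the highest protocol available, which is >= 2
# on both Python 2.7 and Python 3.
data = pickle.dumps(tree, protocol=-1)
restored = pickle.loads(data)

print(restored.name, [child.name for child in restored.children])
```

Passing -1 rather than a fixed number keeps the same call working across Python versions whose default protocols differ.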