Commit e0f1cf4

Update README.rst
1 parent 5878e30 commit e0f1cf4

1 file changed

+50
-55
lines changed

README.rst

Lines changed: 50 additions & 55 deletions
@@ -32,28 +32,27 @@ AST. So we can really classify and understand what's going on in
3232
sections of Python bytecode.
3333

3434
Building on this, another thing that makes this different from other
35-
CPython bytecode decompilers is the ability to deparse just
36-
*fragments* of source code and give source-code information around a
37-
given bytecode offset.
35+
CPython bytecode decompilers is that it can deparse just
36+
*fragments* of source code and give source-code information around a given bytecode offset.
3837

3938
I use the tree fragments to deparse fragments of code *at run time*
4039
inside my trepan_ debuggers_. For that, bytecode offsets are recorded
4140
and associated with fragments of the source code. This purpose,
4241
although compatible with the original intention, is yet a little bit
4342
different. See this_ for more information.
4443

45-
Python fragment deparsing given an instruction offset is useful in
44+
Python fragment deparsing, given an instruction offset, is useful in
4645
showing stack traces and can be incorporated into any program that
4746
wants to show a location in more detail than just a line number at
48-
runtime. This code can be also used when source-code information does
47+
runtime. This code can also be used when source code information does
4948
not exist and there is just bytecode. Again, my debuggers make use of
5049
this.
5150

52-
There were (and still are) a number of decompyle, uncompyle,
51+
There were (and still are) several decompyle, uncompyle,
5352
uncompyle2, uncompyle3 forks around. Many of them come basically from
5453
the same code base, and (almost?) all of them are no longer actively
55-
maintained. One was really good at decompiling Python 1.5-2.3, another
56-
really good at Python 2.7, but that only. Another handles Python 3.2
54+
maintained. One was really good at decompiling Python 1.5-2.3, another was really good at Python 2.7,
55+
but only that. Another handles Python 3.2
5756
only; another patched that and handled only 3.3. You get the
5857
idea. This code pulls all of these forks together and *moves
5958
forward*. There is some serious refactoring and cleanup in this code
@@ -62,29 +61,29 @@ on in decompyle3_.
6261

6362
This demonstrably does the best in decompiling Python across all
6463
Python versions. And even when there is another project that only
65-
provides decompilation for subset of Python versions, we generally do
64+
provides decompilation for a subset of Python versions, we generally do
6665
demonstrably better for those as well.
6766

6867
How can we tell? By taking Python bytecode that comes distributed with
69-
that version of Python and decompiling these. Among those that
68+
that version of Python and decompiling it. Among those that
7069
successfully decompile, we can then make sure the resulting programs
7170
are syntactically correct by running the Python interpreter for that
7271
bytecode version. Finally, in cases where the program has a test for
7372
itself, we can run the check on the decompiled code.
7473

75-
We use an automated processes to find bugs. In the issue trackers for
76-
other decompilers, you will find a number of bugs we've found along
77-
the way. Very few to none of them are fixed in the other decompilers.
74+
We use automated processes to find bugs. In the issue trackers for
75+
other decompilers, you will find several bugs we've found along
76+
the way. Very few of them are fixed in the other decompilers.
7877

7978
Requirements
8079
------------
8180

8281
The code in the git repository can be run from Python 2.4 to the
83-
latest Python version, with the exception of Python 3.0 through
84-
3.2. Volunteers are welcome to address these deficiencies if there a
82+
latest Python version, except Python 3.0 through
83+
3.2. Volunteers are welcome to address these deficiencies if there is a
8584
desire to do so.
8685

87-
The way it does this though is by segregating consecutive Python versions into
86+
The way it does this, though, is by segregating consecutive Python versions into
8887
git branches:
8988

9089
master
@@ -118,7 +117,7 @@ or::
118117

119118
$ python setup.py install # may need sudo
120119

121-
A GNU Makefile is also provided so :code:`make install` (possibly as root or
120+
A GNU Makefile is also provided, so :code:`make install` (possibly as root or
122121
sudo) will do the steps above.
123122

124123
Running Tests
@@ -128,7 +127,7 @@ Running Tests
128127

129128
make check
130129

131-
A GNU makefile has been added to smooth over setting running the right
130+
A GNU makefile has been added to smooth over setting up and running the right
132131
command, and running tests from fastest to slowest.
133132

134133
If you have remake_ installed, you can see the list of all tasks
@@ -153,87 +152,83 @@ For usage help:
153152
Verification
154153
------------
155154

156-
In older versions of Python it was possible to verify bytecode by
157-
decompiling bytecode, and then compiling using the Python interpreter
155+
In older versions of Python, it was possible to verify bytecode by
156+
decompiling it and then compiling using the Python interpreter
158157
for that bytecode version. Having done this, the bytecode produced
159-
could be compared with the original bytecode. However as Python's code
160-
generation got better, this no longer was feasible.
158+
could be compared with the original bytecode. However, as Python's code
159+
generation got better, this was no longer feasible.
161160

162161
If you want Python syntax verification of the correctness of the
163162
decompilation process, add the :code:`--syntax-verify` option. However since
164-
Python syntax changes, you should use this option if the bytecode is
163+
Python syntax changes, you should use this option if the bytecode is
165164
the right bytecode for the Python interpreter that will be checking
166165
the syntax.
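
The idea behind a syntax check can be sketched with Python's built-in ``compile()``. This is only an illustration of the principle, not how :code:`--syntax-verify` is implemented; ``syntax_ok`` is a hypothetical helper name. It also shows why the checking interpreter matters: the source is parsed by whatever Python is running the check.

```python
def syntax_ok(source_text, filename="<decompiled>"):
    """Return True if source_text parses under the *running* interpreter.

    Hypothetical helper: decompiled source that is valid for one Python
    version may be rejected here simply because the grammar differs.
    """
    try:
        compile(source_text, filename, "exec")
    except SyntaxError:
        return False
    return True


good = syntax_ok("x = 1\n")
bad = syntax_ok("def f(:\n    pass\n")
```

Here ``good`` is true and ``bad`` is false. Decompiled Python 2 source such as ``print "hi"`` would also fail under a Python 3 interpreter, which is the version caveat described above.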
167166

168-
You can also cross compare the results with another version of
169-
*uncompyle6* since there are sometimes regressions in decompiling
170-
specific bytecode as the overall quality improves.
167+
You can also cross-compare the results with another version of
168+
*uncompyle6* since there are sometimes regressions in decompiling specific bytecode, as the overall quality improves.
171169

172170
For Python 3.7 and 3.8, the code in decompyle3_ is generally
173171
better.
174172

175-
Or try specific another python decompiler like uncompyle2_, unpyc37_,
176-
or pycdc_. Since the later two work differently, bugs here often
173+
Or try another specific Python decompiler like uncompyle2_, unpyc37_,
174+
or pycdc_. Since the latter two work differently, bugs here often
177175
aren't in those, and vice versa.
178176

179177
There is an interesting class of these programs that is readily
180-
available give stronger verification: those programs that when run
178+
available to give stronger verification: those programs that, when run,
181179
test themselves. Our test suite includes these.
182180

183-
And Python comes with another a set of programs like this: its test
181+
And Python comes with another set of programs like this: its test
184182
suite for the standard library. We have some code in :code:`test/stdlib` to
185183
facilitate this kind of checking too.
186184

187185
Known Bugs/Restrictions
188186
-----------------------
189187

190-
The biggest known and possibly fixable (but hard) problem has to do
191-
with handling control flow. (Python has probably the most diverse and
188+
The biggest known and possibly fixable (but hard) problem has to do with handling control flow. (Python has probably the most diverse and
192189
screwy set of compound statements I've ever seen; there
193190
are "else" clauses on loops and try blocks that I suspect many
194191
programmers don't know about.)
195192

196193
All of the Python decompilers that I have looked at have problems
197-
decompiling Python's control flow. In some cases we can detect an
194+
decompiling Python's control flow. In some cases, we can detect an
198195
erroneous decompilation and report that.
199196

200197
Python support is pretty good for Python 2.
201198

202-
On the lower end of Python versions, decompilation seems pretty good although
199+
On the lower end of Python versions, decompilation seems pretty good, although
203200
we don't have any automated testing in place for Python's distributed tests.
204-
Also, we don't have a Python interpreter for versions 1.6, and 2.0.
201+
Also, we don't have a Python interpreter for versions 1.6 and 2.0.
205202

206203
In the Python 3 series, Python support is strongest around 3.4 or
207204
3.3 and drops off as you move further away from those versions. Python
208-
3.0 is weird in that it in some ways resembles 2.6 more than it does
205+
3.0 is weird in that it, in some ways, resembles 2.6 more than it does
209206
3.1 or 2.7. Python 3.6 changes things drastically by using word codes
210207
rather than byte codes. As a result, the jump offset field in a jump
211-
instruction argument has been reduced. This makes the :code:`EXTENDED_ARG`
212-
instructions are now more prevalent in jump instruction; previously
208+
instruction argument has been reduced. This makes the :code:`EXTENDED_ARG` instructions now more prevalent in jump instructions; previously
213209
they had been rare. Perhaps to compensate for the additional
214210
:code:`EXTENDED_ARG` instructions, additional jump optimization has been
215-
added. So in sum handling control flow by ad hoc means as is currently
211+
added. So, in sum, handling control flow by ad hoc means, as is currently
216212
done, is worse.
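
The effect of word codes on jumps can be observed with the standard ``dis`` module. The following is a minimal sketch, assuming it is run on Python 3.6 or later; the sample source and variable names are made up for illustration. It compiles an ``if`` whose body is too long for the jump over it to fit in a one-byte argument, then looks for a jump instruction directly preceded by :code:`EXTENDED_ARG`.

```python
import dis

# An "if" whose body is long enough that the jump over it cannot fit
# in the one-byte argument of a 3.6+ wordcode instruction.
body = "".join("    y = 1\n" for _ in range(300))
code = compile("if x:\n" + body, "<example>", "exec")

instructions = list(dis.get_instructions(code))

# Collect (EXTENDED_ARG, jump) pairs that sit next to each other.
prefixed_jumps = [
    (prev.opname, cur.opname)
    for prev, cur in zip(instructions, instructions[1:])
    if prev.opname == "EXTENDED_ARG" and "JUMP" in cur.opname
]

# With word codes, every instruction occupies two bytes.
even_length = len(code.co_code) % 2 == 0
```

The exact jump opcode name varies between Python versions, so the sketch matches on ``"JUMP"`` in the name rather than on one specific instruction.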
217213

218-
Between Python 3.5, 3.6, 3.7 there have been major changes to the
214+
Between Python 3.5, 3.6, and 3.7, there have been major changes to the
219215
:code:`MAKE_FUNCTION` and :code:`CALL_FUNCTION` instructions.
220216

221217
Python 3.8 removes :code:`SETUP_LOOP`, :code:`SETUP_EXCEPT`,
222218
:code:`BREAK_LOOP`, and :code:`CONTINUE_LOOP` instructions, which may
223219
make control-flow detection harder, lacking the more sophisticated
224220
control-flow analysis that is planned. We'll see.
225221

226-
Currently not all Python magic numbers are supported. Specifically in
222+
Currently, not all Python magic numbers are supported. Specifically in
227223
some versions of Python, notably Python 3.6, the magic number has
228224
changed several times within a version.
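
For readers unfamiliar with magic numbers: a compiled ``.pyc`` file begins with a 4-byte value identifying the bytecode format that produced it. The sketch below uses only the standard library to read that value and compare it with the running interpreter's own magic; ``pyc_magic`` is a hypothetical helper name, and this is not how uncompyle6 itself looks up supported versions.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_magic(path):
    """Return the 4-byte magic number at the start of a .pyc file."""
    with open(path, "rb") as f:
        return f.read(4)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    # Byte-compile with the running interpreter, then read the header back.
    pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "example.pyc"), doraise=True
    )
    magic = pyc_magic(pyc)

# True here because the same interpreter wrote and checked the file.
matches_running_interpreter = magic == importlib.util.MAGIC_NUMBER
```

A ``.pyc`` written by a different interpreter build, including an intermediate 3.6 build with a different magic, would make this comparison false.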
229225

230-
**We support only released versions, not candidate versions.** Note
231-
however that the magic of a released version is usually the same as
226+
**We support only released versions, not candidate versions.** Note, however, that the magic of a released version is usually the same as
232227
the *last* candidate version prior to release.
233228

234229
There are also customized Python interpreters, notably Dropbox's,
235230
which use their own magic and encrypt bytecode. With the exception of
236-
the Dropbox's old Python 2.5 interpreter this kind of thing is not
231+
Dropbox's old Python 2.5 interpreter, this kind of thing is not
237232
handled.
238233

239234
We also don't handle PJOrion_ or otherwise obfuscated code. For
@@ -245,20 +240,20 @@ Py2EXE_, although we can probably decompile the code after you extract
245240
the bytecode properly. `Pydeinstaller <https://github.com/charles-dyfis-net/pydeinstaller>`_ may help with unpacking PyInstaller bundles.
246241

247242
Handling pathologically long lists of expressions or statements is
248-
slow. We don't handle Cython_ or MicroPython which don't use bytecode.
243+
slow. We don't handle Cython_ or MicroPython, which don't use bytecode.
249244

250245
There are numerous bugs in decompilation. And that's true for every
251-
other CPython decompiler I have encountered, even the ones that
246+
other CPython decompiler I have encountered, even the ones that
252247
claimed to be "perfect" on some particular version like 2.4.
253248

254-
As Python progresses decompilation also gets harder because the
249+
As Python progresses, decompilation also gets harder because the
255250
compilation is more sophisticated and the language itself is more
256251
sophisticated. I suspect that attempts there will be fewer ad-hoc
257252
attempts like unpyc37_ (which is based on a 3.3 decompiler) simply
258253
because it is harder to do so. The good news, at least from my
259254
standpoint, is that I think I understand what's needed to address the
260-
problems in a more robust way. But right now until such time as
261-
project is better funded, I do not intend to make any serious effort
255+
problems in a more robust way. But right now, until such time as
256+
the project is better funded, I do not intend to make any serious effort
262257
to support Python versions 3.8 or 3.9, including bugs that might come
263258
in. I imagine at some point I may be interested in it.
264259

@@ -271,16 +266,16 @@ there aren't that many people who have been working on bug fixing.
271266
Some of the bugs in 3.7 and 3.8 are simply a matter of back-porting
272267
the fixes in *decompyle3*. Any volunteers?
273268

274-
You may run across a bug, that you want to report. Please do so after
269+
You may run across a bug that you want to report. Please do so after
275270
reading `How to report a bug
276271
<https://github.com/rocky/python-uncompyle6/blob/master/HOW-TO-REPORT-A-BUG.md>`_ and
277272
follow the `instructions when opening an issue <https://github.com/rocky/python-uncompyle6/issues/new?assignees=&labels=&template=bug-report.md>`_.
278273

279274
Be aware that it might not get my attention for a while. If you
280275
sponsor or support the project in some way, I'll prioritize your
281276
issues above the queue of other things I might be doing instead. In
282-
rare situtations, I can do a hand decompilation of bytecode for a fee.
283-
However this is expansive, usually beyond what most people are willing
277+
rare situations, I can do a hand decompilation of bytecode for a fee.
278+
However, this is expensive, usually beyond what most people are willing
284279
to spend.
285280

286281
See Also
@@ -290,13 +285,13 @@ See Also
290285
* https://github.com/rocky/python-decompile3 : Much smaller and more modern code, focusing on 3.7 and 3.8. Changes in that will get migrated back here.
291286
* https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here. Currently unmaintained.
292287
* https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations. Currently unmaintained.
293-
* https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. There are situations where :code:`uncompyle6` results are incorrect while :code:`uncompyle2` results are not, but more often uncompyle6 is correct when uncompyle2 is not. Because :code:`uncompyle6` adheres to accuracy over idiomatic Python, :code:`uncompyle2` can produce more natural-looking code when it is correct. Currently :code:`uncompyle2` is lightly maintained. See its issue `tracker <https://github.com/wibiti/uncompyle2/issues>`_ for more details.
288+
* https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. There are situations where :code:`uncompyle6` results are incorrect, while :code:`uncompyle2` results are not, but more often :code:`uncompyle6` is correct when :code:`uncompyle2` is not. Because :code:`uncompyle6` adheres to accuracy over idiomatic Python, :code:`uncompyle2` can produce more natural-looking code when it is correct. Currently, :code:`uncompyle2` is lightly maintained. See its issue `tracker <https://github.com/wibiti/uncompyle2/issues>`_ for more details.
294289
* `How to report a bug <https://github.com/rocky/python-uncompyle6/blob/master/HOW-TO-REPORT-A-BUG.md>`_
295290
* The HISTORY_ file.
296291
* https://github.com/rocky/python-xdis : Cross Python version disassembler
297292
* https://github.com/rocky/python-xasm : Cross Python version assembler
298-
* https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail
299-
* https://github.com/zrax/pycdc : The README for this C++ code says it aims to support all versions of Python. You can aim your slign shot for the moon too, but I doubt you are going to hit it. This code is best for Python versions around 2.7 and 3.3 when the code was initially developed. Accuracy for current versions of Python3 and early versions of Python is lacking. Without major effort, it is unlikely it can be made to support current Python 3. See its `issue tracker <https://github.com/zrax/pycdc/issues>`_ for details. Currently lightly maintained.
293+
* https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents that describe the code and aspects of it in more detail
294+
* https://github.com/zrax/pycdc : The README for this C++ code says it aims to support all versions of Python. You can aim your slingshot for the moon, too, but I doubt you are going to hit it. This code is best for Python versions around 2.7 and 3.3, when the code was initially developed. Accuracy for current versions of Python 3 and early versions of Python is lacking. Without major effort, it is unlikely that it can be made to support the current Python 3. See its `issue tracker <https://github.com/zrax/pycdc/issues>`_ for details. Currently lightly maintained.
300295

301296

302297
.. _Cython: https://en.wikipedia.org/wiki/Cython
