RustPython Is Implementing Python 3 Within Rust
Originally posted by jacob:
Correction: reference counting is not slow at all; that's precisely why it's often preferred to a full-blown GC. I don't know what CPython uses internally, but I doubt it's pure reference counting. First, the well-known limitation of reference counting is that it can't handle reference cycles, and to my knowledge Python does not suffer from that problem. Second, a well-implemented reference-counting memory manager can use atomic instructions to increment and decrement the counters, and shouldn't require thread locking, much less a GIL.
As ssokolow states further up, regular Python actually uses a mix of reference counting plus a tracing GC to detect cycles.
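That hybrid scheme is easy to watch from Python itself: plain refcounting frees acyclic objects the moment their count hits zero, while the `gc` module's collector is what reclaims cycles. A minimal CPython sketch:

```python
import gc

# An acyclic object dies as soon as its refcount reaches zero;
# no collector pass is involved.
acyclic = [1, 2, 3]
del acyclic  # freed immediately by reference counting

# A self-referential list keeps its own refcount above zero forever,
# so only the cycle-detecting collector can reclaim it.
cycle = []
cycle.append(cycle)
del cycle  # refcount is still 1: the list points at itself

found = gc.collect()  # run a full collection pass
print(found)          # at least one unreachable object was found
```

`gc.collect()` returns the number of unreachable objects it found, which is how the cycle's reclamation becomes visible here.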
As Guest also stated further up, "optimized reference counting isn't necessarily slow because an advanced compiler can sometimes elide refcount updates."
But Python is a scripting language that isn't precompiled into executables, so all it can apply is simple reference counting, and that's inescapably slow.
So maybe we have, very broadly, a three-level speed hierarchy? Simple reference counting as in Python is generally slower than tracing garbage collection as in the JVM or .NET, which in turn is generally slower than optimized reference counting as in Objective-C or Swift. Or I suppose you could call manual memory management in C or C++, or the compiler-enforced memory management in Rust, a form of "optimized reference counting".
Originally posted by Michael_S:
So maybe we have, very broadly, a three-level speed hierarchy? Simple reference counting as in Python is generally slower than tracing garbage collection as in the JVM or .NET, which in turn is generally slower than optimized reference counting as in Objective-C or Swift. Or I suppose you could call manual memory management in C or C++, or the compiler-enforced memory management in Rust, a form of "optimized reference counting".
Languages like Java, C#, and Swift can be faster because they've got JIT compilers, static type systems, and they don't allow you to rewrite pretty much anything about the program on the fly as it runs. Python has to dispatch everything dynamically and needs the GIL to ensure that one thread doesn't pull the rug out from under another by doing something like redefining a function.
Refcounting vs. tracing GC actually isn't a clear-cut thing when all else is equal. Reference counts are individually much lighter than a GC pass, but a full-blown tracing GC is often much faster overall because it amortizes the cost of memory management. (I.e., the longer you go between GC collection passes, the better the performance but the worse the peak memory consumption.)
Last edited by ssokolow; 07 February 2019, 12:16 AM.
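CPython even exposes that trade-off as a knob: the `gc` module's generation thresholds decide how much garbage may accumulate before a pass runs. A sketch with illustrative (not recommended) values:

```python
import gc

# The first threshold is roughly "net new container allocations
# before a generation-0 pass runs"; defaults vary by CPython version.
original = gc.get_threshold()
print(original)  # e.g. (700, 10, 10) on many CPython builds

# Raise it: fewer, bigger passes -> less total GC work, higher peak memory.
gc.set_threshold(10_000, 20, 20)  # illustrative values only
assert gc.get_threshold() == (10_000, 20, 20)

# Lower it: frequent, small passes -> more overhead, tighter memory.
gc.set_threshold(100, 5, 5)

gc.set_threshold(*original)  # put things back how we found them
```

Dialing the first number up or down is exactly the performance-versus-peak-memory lever described above.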
Originally posted by atomsymbol:
Just a note:
In the case of emulator-like programs, such as the RustPython interpreter written in Rust, reachability analysis on the RustPython implementation while Rust is compiling it is impossible because of the halting problem. Once a program written in Rust accepts general-purpose computation as input data (= bytecode fed to interpreters or emulators), compile-time analysis is impossible because the behavior depends on an unknown, universal input.
If you can impose constraints on the programmers, you can prove some things in a restricted problem space.
That's also why a borrow checker is the kind of thing that had to prove its worth in a new language, rather than by getting retrofitted onto an existing language. It requires code to be written with its limitations in mind, which means it gives a poor showing when you have a large body of preexisting code.
Originally posted by atomsymbol:
That seems too pessimistic to me about Python. In my opinion, in a loose sense of safety for concurrency, the interpreter can perform tasks in parallel as long as it does not crash because of that. No pure Python code should, in theory, crash the interpreter/VM. However, this allows Python code to behave non-deterministically or "unexpectedly", and if the programmer wants the program to behave according to his/her intentions, the programmer is responsible for inserting calls to concurrency primitives such as locking/unlocking mutexes, communicating via channels (if the language supports channels), performing atomic updates, or making per-thread copies of data structures.
In my opinion, it is wrong to demand that concurrent Python code behave just like sequential code. As long as parallel execution does not crash the interpreter/VM, it should be considered a valid Python implementation.
A concurrent Python implementation would need to choose a memory model, as Java and Go did.
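A concrete instance of "the programmer is responsible for inserting concurrency primitives": even under the GIL, a read-modify-write such as `counter += 1` is not atomic (it compiles to separate load, add, and store steps), so two threads need an explicit lock to avoid losing updates. A minimal sketch:

```python
import threading

counter = 0
lock = threading.Lock()

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        with lock:  # without this, increments from two threads can interleave and be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 200000, deterministically, because of the lock
```

Drop the `with lock:` line and the result may come up short on some runs, which is exactly the "non-deterministic but non-crashing" behavior described above.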
Originally posted by jacob:
Correction: reference counting is not slow at all, that's precisely the reason why it's often preferred to a full blown GC. […]
Python uses pure reference counting, and the "atomicity" is obtained through the GIL.
Also, Python suffers from reference cycles like any other software that uses reference counting: it has a garbage collector which heuristically traverses the objects known as containers (i.e., objects that can contain other objects). Tuples, lists, dictionaries, etc. are containers; their implementations carry a flag declaring them as such, and they provide special methods to traverse the contained objects in search of reference cycles. Integers, strings, floats, etc. cannot contain other objects, so they don't need to be tracked by the garbage collector and are handled by reference counting alone.
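That container/non-container split is directly observable through `gc.is_tracked`, which reports whether the collector follows a given object at all:

```python
import gc

# Containers are tracked because only they can form reference cycles.
assert gc.is_tracked([])             # list: tracked
assert gc.is_tracked({"a": []})      # dict holding a container: tracked

# Atomic objects can't reference anything, so refcounting alone suffices.
assert not gc.is_tracked(42)         # int: untracked
assert not gc.is_tracked("hello")    # str: untracked

# CPython goes further and leaves containers that provably can't form a
# cycle off the collector's lists: a dict of only atomic keys and values.
assert not gc.is_tracked({"a": 1})

print("all tracking checks passed")
```

The last case shows the heuristic at work: being a container makes an object *eligible* for tracking, but CPython skips tracking when the contents can't close a cycle.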
Originally posted by blackshard:
Reference counting is not slow at all: it is just a matter of increasing and decreasing a counter and releasing memory when the reference counter reaches zero.
Originally posted by ssokolow:
That's actually the point I was making when I said GC can be faster because it amortizes costs. All those little increments and decrements add up to the point that a well-tuned GC can be faster because the cost of one GC pass can be less than the cost of all the increments and decrements that it's an alternative to.