Mono 2.11 Release Brings Many Changes
-
Originally posted by ciplogic:
Java slow?
I wrote similar code in Java and C++ (I will do C# if needed).
As you noticed, the Java code performs the same computation using Integer values (which box the int "primitive type") in 114 ms, against 20685 ms for C++ built with VS 2010 in release mode. Changing the line:
smart_pointers.cpp
Code:
#include <vector>
#include <memory>
#include <boost/smart_ptr.hpp>

#ifndef POINTER
#define POINTER Item*
//#define POINTER std::shared_ptr<Item>
//#define POINTER boost::shared_ptr<Item>
#endif

struct Item {
    int x;
    int y;
};

std::vector<POINTER> items;

__attribute__ ((noinline)) void ProcessItem(POINTER item) {}

int main() {
    for(int i = 0; i < 10; i++) {
        POINTER x(new Item);
        items.push_back(x);
    }
    for(int i = 0; i < 100000000; i++)
        for(auto it = items.begin(); it != items.end(); ++it)
            ProcessItem(*it);
    return 0;
}
Code:
using System.Collections.Generic;

class Item {
    int x;
    int y;
};

class Dummy {
    static List<Item> items = new List<Item>();

    static void ProcessItem(Item item) { }

    static int Main(string[] args) {
        for(int i = 0; i < 10; i++) {
            items.Add(new Item());
        }
        for(int i = 0; i < 100000000; i++)
            foreach(Item item in items)
                ProcessItem(item);
        return 0;
    }
}
Code:
import java.util.*;

class Item {
    int x;
    int y;
};

class Dummy {
    static Vector<Item> items = new Vector<Item>();

    static void ProcessItem(Item item) { }

    public static void main(String[] args) {
        for(int i = 0; i < 10; i++) {
            items.add(new Item());
        }
        for(int i = 0; i < 100000000; i++)
            for(Item item : items)
                ProcessItem(item);
    }
}
Code:
CXXFLAGS=-std=c++0x -O2
#clang chokes on STL headers with c++0x
CLANG_EXTRA_FLAGS=-std=c++98
LDFLAGS=-lboost_system

all: execute

gccregular: smart_pointers.cpp
	g++ -o $@ $(CXXFLAGS) $(LDFLAGS) $^

gccsmart: smart_pointers.cpp
	g++ -o $@ -DPOINTER=std::shared_ptr\<Item\> $(CXXFLAGS) $(LDFLAGS) $^

gccboost: smart_pointers.cpp
	g++ -o $@ -DPOINTER=boost::shared_ptr\<Item\> $(CXXFLAGS) $(LDFLAGS) $^

clangregular: smart_pointers.cpp
	clang++ -o $@ $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^

clangsmart: smart_pointers.cpp
	clang++ -o $@ -DPOINTER=std::shared_ptr\<Item\> $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^

clangboost: smart_pointers.cpp
	clang++ -o $@ -DPOINTER=boost::shared_ptr\<Item\> $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^

smart_pointers.exe: smart_pointers.cs
	mcs $^

Dummy.class: smart_pointers.java
	javac $^

gcj: smart_pointers.java
	gcj -o $@ --main=Dummy $^

execute: gccregular gccsmart gccboost clangregular clangboost smart_pointers.exe Dummy.class gcj
	echo -e \\ngcc with regular pointers
	time ./gccregular
	echo -e \\ngcc with boehm-gc
	LD_PRELOAD=/usr/lib/libgc.so time ./gccregular
	echo -e \\ngcc with native smart pointers
	time ./gccsmart
	echo -e \\ngcc with boost smart pointers
	time ./gccboost
	echo -e \\nclang with regular pointers
	time ./clangregular
	echo -e \\nclang with boehm-gc
	LD_PRELOAD=/usr/lib/libgc.so time ./clangregular
	#echo -e \\nclang with native smart pointers
	#time ./clangsmart
	echo -e \\nclang with boost smart pointers
	time ./clangboost
	echo -e \\nMono:
	time mono smart_pointers.exe
	echo -e \\nJava:
	time java Dummy
	echo -e \\nGcj:
	time ./gcj

clean:
	rm -f gccregular gccsmart gccboost clangregular clangsmart clangboost gcj *.exe *.class
-
Originally posted by Ansla:
smart_pointers.java
I changed the program slightly to iterate over an array of Item, as follows:
Code:
package run;
(...)
static Item[] items = new Item[10];
(...)
for(int i = 0; i < 10; i++) {
    items[i] = new Item();
}
(...)
Code:
Time: 727 ms
Code:
Time: 55430 ms
Last edited by ciplogic; 28 March 2012, 10:50 AM.
-
Originally posted by ciplogic:
Yes, I think the same. Foreach loops in Java and C# obtain an iterator. Java also performs type checks (because generics are implemented with type erasure), so on every step it checks that the Object stored in the items collection really is an Item. It is as if you made an array of void* in C++ and did a dynamic cast from void* to Item on every loop step, then passed that pointer to the Process method. The dynamic_cast may be the most expensive operation in your loop, not the loop's own operations.
Originally posted by ciplogic:
Also, if you play around and get 0 ms every time, it means you hit a compiler optimization (on either the Java or the C++ side). In my original code, if you removed the +2 (in the process method), the Java optimizer would optimize it out and you would get times of around 4 to 10 ms, but I wanted to be as objective as possible.
-
Originally posted by Ansla:
I replaced Vector with ArrayList and it reduced the time for Icedtea to 3.17 seconds and for gcj to 37.86 seconds. So it's just the Vector in Java that's slow as hell.
That's why I used __attribute__ ((noinline)) in the C++ code. Do you know if there is an equivalent for Java/C#? Or are they just too high level to allow such details to be tweaked?
Also, Vector is a synchronized counterpart of "std::vector", whereas ArrayList is not synchronized. The link I gave you also states that basic arrays are the fastest construct (the dynamic cast check is not needed there).
Java supports escape analysis, but it has to be enabled with flags, and sometimes it may convert the Vector to an ArrayList if the code matches the optimization. You have to add the following to Java's arguments: -server -XX:+DoEscapeAnalysis
Last edited by ciplogic; 28 March 2012, 11:18 AM.
-
Making the same change:
Code:
(...)
static Item[] items = new Item[10];
(...)
for(int i = 0; i < 10; i++) {
    items[i] = new Item();
}
(...)
in C# will improve the performance around 2.5x.
If you don't want to change the code in two places and you want your program to keep storing a List<Item>, the "hot" loop has to be rewritten as follows:
Code:
var itemArray = items.ToArray();
for (var i = 0; i < 10000000; i++)
    foreach (var item in itemArray)
        ProcessItem(item);
If you swap the loops (making the iteration over the array the outer loop):
Code:
var itemArray = items.ToArray();
foreach (var item in itemArray)
    for (var i = 0; i < 10000000; i++)
        ProcessItem(item);
For reference, the original time of your code in C#: 14040 ms.
-
Originally posted by ciplogic:
Making the same change:
in C# will improve the performance around 2.5x.
Anyway, smart pointers are no different than any other class that a programmer used to a VM language might be tempted to pass by value (or return by value) instead of by const reference when writing C++ code for the first time. The worst case is when that class is a collection; imagine code like this:
Code:
std::vector<Item> GetItems() {
    return m_vector;
}
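To make the cost in that example visible, here is a minimal sketch that counts element copies when a getter returns the collection by value versus by const reference. The `Container` class, the copy-counting `Item` and the helper function names are hypothetical, made up for this illustration; only `m_vector` and `GetItems` come from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Item {
    int x = 0;
    int y = 0;
    static std::size_t copies;
    Item() = default;
    Item(const Item& other) : x(other.x), y(other.y) { ++copies; }
};
std::size_t Item::copies = 0;

// Hypothetical owner of the collection (m_vector as in the post).
class Container {
public:
    explicit Container(std::size_t n) : m_vector(n) {}

    // Returns a full copy: every element is copy-constructed.
    std::vector<Item> GetItemsByValue() const { return m_vector; }

    // Returns a view of the existing vector: nothing is copied.
    const std::vector<Item>& GetItemsByRef() const { return m_vector; }

private:
    std::vector<Item> m_vector;
};

// Number of Item copies made by one by-value call.
std::size_t CopiesWhenReturnedByValue(const Container& c) {
    Item::copies = 0;
    std::vector<Item> local = c.GetItemsByValue();
    assert(!local.empty());
    return Item::copies;
}

// Number of Item copies made by one by-reference call.
std::size_t CopiesWhenReturnedByRef(const Container& c) {
    Item::copies = 0;
    const std::vector<Item>& local = c.GetItemsByRef();
    assert(!local.empty());
    return Item::copies;
}
```

With a `Container` of 10 items, the by-value helper should report 10 copies and the by-reference one none. The by-value version also allocates a new buffer on every call, which this sketch does not measure.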
-
Originally posted by Ansla:
Since iterating over a collection has such a big impact on VM-based languages, the better solution for measuring strictly the cost of copying a smart pointer is code with no collections involved. I only included a collection in my tests because that was the use case you suggested initially.
Also, as you noticed, by mistake (I'm not blaming you for it) you made your code inefficient by using a poor Java (and C#) construct, as many programmers do, and then drew the wrong conclusion. C# can be used to make snappy applications (or at least snappy enough), and if it is not fast enough, in many cases it is not the fault of C# or Java. Sometimes it is, but in most cases people, as you wrote, do not know the performance implications of their constructs.
Being high level, C# brings some features that C++ could simulate too, but that are harder to get right natively (like dependency injection, parallel loops, or runtime code generation executed via Reflection; in C++ you would invoke an external compiler, build a DLL and load it with LoadLibrary/dlopen), and it should be credited for that. People programming only in "older" languages will miss these abstractions. How useful is it to have a JIT? For 95% of applications it should not matter, but for things like shaders compiled with LLVM (as OS X does), it is nice to have.
I think you would use Mono if it had performance in the range of GCC (when you use a lot of classes/pointers, instead of using shared_ptr and putting a const & everywhere the compiler does not complain, which is tedious at the least) and you knew that Mono or .NET was installed on the target machine, wouldn't you?
Note: about me suggesting that you needed to use a collection, I don't know where I said that; please quote me.
Last edited by ciplogic; 28 March 2012, 01:18 PM.
-
Originally posted by ciplogic:
As you understood my answer, I agree. I mean, when people judge Java/Mono as being slow, they don't do it with data. I don't want to prove to you that C# is faster than C++. Also, I had a (somewhat) bad experience with the performance implications of boost's smart pointers, which I think should be considered. Whereas references in Java/C# are in fact pointers (with an integer as "vtable", so they have memory overhead but no performance overhead), smart pointers have performance overhead (they "need" to update the count with increments/decrements every time you use a pointer from an external component, whereas when you get a Java Object or C# Object you don't need to). Smart pointers also need extra thought (as you need to use weak references to break circular dependencies).
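The point about weak references can be sketched with the standard `std::shared_ptr` / `std::weak_ptr` (not the boost types mentioned in the post, although those behave similarly). The node types, the global counter and the function names below are hypothetical, and the first function leaks on purpose to show the problem.

```cpp
#include <cassert>
#include <memory>

// Counts destructor calls; for illustration only.
static int g_destroyed = 0;

// Two nodes that own each other: the counts never drop to zero.
struct StrongNode {
    std::shared_ptr<StrongNode> other;
    ~StrongNode() { ++g_destroyed; }
};

// Same shape, but the back link is weak and does not keep its target alive.
struct WeakNode {
    std::shared_ptr<WeakNode> next;
    std::weak_ptr<WeakNode> prev;
    ~WeakNode() { ++g_destroyed; }
};

// Returns how many nodes were destroyed after the handles went out of scope.
int DestroyedWithStrongCycle() {
    g_destroyed = 0;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;  // cycle: a keeps b alive and b keeps a alive
    }  // intentional leak: neither destructor runs
    return g_destroyed;
}

int DestroyedWithWeakBackLink() {
    g_destroyed = 0;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;   // a owns b
        b->prev = a;   // b only observes a
    }  // a is released, which in turn releases b
    return g_destroyed;
}
```

In this sketch the strong cycle destroys nothing, while the version with a weak back link destroys both nodes. A garbage-collected runtime handles the first case without the programmer having to think about it, which is the "extra thought" cost described above.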
Originally posted by ciplogic:
I know that this code can take multiple forms, and I know that it can be changed to use a const reference, but the idea remains the same (if in the loop a smart_ptr is assigned to another).
Code:
vector<smart_ptr<Item>> items;
for(auto it = items.begin(); it != items.end(); ++it) {
    ProcessItem(*it); //increment and decrement of reference
}
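The increment and decrement in the loop above can be observed directly with `std::shared_ptr::use_count()`. This is a minimal single-threaded sketch; the two function names are hypothetical, and `std::shared_ptr` stands in for the `smart_ptr` placeholder used in the quoted code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Item {
    int x;
    int y;
};

// By value: the parameter is a copy of the caller's pointer, so the
// reference count is incremented on entry and decremented on return.
long UseCountByValue(std::shared_ptr<Item> item) {
    return item.use_count();
}

// By const reference: no copy is made, so the count is untouched.
long UseCountByConstRef(const std::shared_ptr<Item>& item) {
    return item.use_count();
}

// Builds a vector of n uniquely owned items (helper for the demo).
std::vector<std::shared_ptr<Item>> MakeItems(int n) {
    std::vector<std::shared_ptr<Item>> items;
    for (int i = 0; i < n; i++)
        items.push_back(std::make_shared<Item>());
    return items;
}
```

For an element held only by the vector, the by-value call sees a count of 2 and the const-reference call sees 1. The increments are atomic operations in typical implementations, which is where the per-call cost comes from.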
-
Originally posted by Ansla:
shared_ptr is the most complex smart pointer, so probably the one with the highest overhead, and I have used it only on rare occasions. Most of the time, when the algorithm is simple enough, it just has to know whether it passed ownership to some other component or has to free the object itself. In these cases auto_ptr/unique_ptr are better suited.
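A minimal sketch of that ownership hand-off with `std::unique_ptr` follows. The `Component` type and the function name are hypothetical, chosen only to illustrate "passed ownership to some other component".

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Item {
    int x;
    int y;
};

// Hypothetical component that takes over ownership of an Item.
struct Component {
    std::unique_ptr<Item> owned;

    // Taking the unique_ptr by value forces the caller to std::move it,
    // which makes the transfer explicit at the call site.
    void Take(std::unique_ptr<Item> item) { owned = std::move(item); }
};

// Returns true if the caller no longer owns the item after the hand-off.
bool CallerGaveUpOwnership() {
    std::unique_ptr<Item> item(new Item{1, 2});
    Component component;
    component.Take(std::move(item));
    // A moved-from unique_ptr is guaranteed to be null; the Item now
    // belongs to the component and is freed when the component dies.
    return item == nullptr && component.owned->x == 1;
}
```

No reference count is involved here: there is a single owner at any time, and the object is freed exactly once when that owner goes away.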
(...)
You didn't exactly suggest I should use it, but my experiment started from the following code you posted and I tried to keep as close to it as possible:
(...)
I wanted to see just how much overhead copying the smart pointer could have, as I never encountered this problem in a real project. I assumed it would be negligible compared to the rest of the stuff going on in a real application, but I wanted to make sure. I still think the smart pointer is the least of your worries here, unless that code happens to be a really hot path; even doing a postfix increment on the iterator could do more damage if the compiler doesn't optimize the temporary iterator away.
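The remark about postfix increment can be illustrated with a hypothetical iterator that counts its own copies. Because the copy constructor here has a visible side effect, the compiler cannot drop the temporary the way it usually can for a plain pointer-like iterator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator over an int array that counts copy constructions.
struct CountingIter {
    const int* p;
    static std::size_t copies;

    explicit CountingIter(const int* ptr) : p(ptr) {}
    CountingIter(const CountingIter& o) : p(o.p) { ++copies; }

    // Prefix: advance in place, no temporary.
    CountingIter& operator++() { ++p; return *this; }

    // Postfix: must save the old position and return it as a temporary.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++p;
        return old;
    }

    bool operator!=(const CountingIter& o) const { return p != o.p; }
    int operator*() const { return *p; }
};
std::size_t CountingIter::copies = 0;

// Iterator copies made while walking [first, last) with ++it.
std::size_t CopiesWithPrefix(const int* first, const int* last) {
    CountingIter::copies = 0;
    CountingIter end(last);
    for (CountingIter it(first); it != end; ++it) {}
    return CountingIter::copies;
}

// Iterator copies made while walking [first, last) with it++.
std::size_t CopiesWithPostfix(const int* first, const int* last) {
    CountingIter::copies = 0;
    CountingIter end(last);
    for (CountingIter it(first); it != end; it++) {}
    return CountingIter::copies;
}
```

Over a 10-element array the prefix loop makes no copies, while the postfix loop makes at least one per step (possibly two if the return value is not elided). For standard library iterators, which are usually trivially copyable, an optimizing compiler normally removes this difference.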
So Java was a bit better (always at least by adding the better development experience). The C++ codebase (if it matters) was "very template based", so it was "performance driven" but very hard to debug (template errors expanding over a page were not uncommon because the developer forgot a column).
Later I worked in the C# world, and I found C# to be a bit slower (slower than both Java and C++) but fast enough, very quick to integrate with C codebases, and friendlier from the standpoint of a C++ developer (which I was for 4-5 years). Insisting that Mono is slow (someone may quote it for that), whether in its GC or its generated code, misses the point in my view: it is its productivity, and everything that comes with it, that people lose when they attack it. And I know where C# can be slow (invisible boxing, using strings, using Collections.Generic everywhere, a place where the STL is also not so fantastic). Yet on both Linux (using parameters for Mono) and Windows (using SharpDevelop) you have a very easy way to profile (one that works better than the C++ Valgrind/KCacheGrind equivalent), so you can find what is really "too slow" and optimize it.
I think that KDE may not need Mono (I think Qt is a close replacement; QObject + Moc + QML is fantastic in my opinion), but Gnome's platform is not so fortunate. Vala helps a little, but as a streamlined workflow, MonoDevelop + Gtk# + C# is in my opinion (after maybe Python) the best way to develop on Gtk+. People shunning Mono shun an option that developers may really use.