Mono 2.11 Release Brings Many Changes


  • #31
    So?

    Originally posted by XorEaxEax View Post
    That doesn't make it any more 'useless' than tons of other languages/frameworks out there; also, 'need' is subjective in this context. That said, Mono is certainly a technology which has had a hard time finding its niche. IIRC Novell wanted to compete with Red Hat by being the Microsoft-'approved' Linux distribution, and thereby got a patent protection agreement from them which only covered Novell users. Novell had purchased Ximian, which developed Mono, an open source implementation of C#/.NET, which Novell obviously hoped would give them an advantage over Red Hat in the enterprise sector. However, this failed, as the enterprise was not interested in Mono and neither was the Linux desktop, as history shows.

    Now it seems Miguel and his devs are doing fine, having found a niche for Mono with their MonoTouch/MonoDroid application frameworks. I'm happy about this: even though I have no personal interest in Mono, there certainly are those out there who do rely on it, and if the devs can make money from Mono it means they can continue to develop Mono as a free, open source language.

    Yes, as you pointed out, Mono is needed for some developers to make some money, just like in any other business model where you try to sell a product, so in that context it is useful and needed; that's fine and nobody can argue against that. But what you said only explains why this tool is still used, which leads me to the conclusion that Mono is just yet another scam.

    As you said, Mono failed to find its usefulness as a tool.



    • #32
      Originally posted by Alex Sarmiento View Post
      As you said, Mono failed to find its usefulness as a tool.
      Nonsense; as long as it is being actively used it's still useful as a tool, which it obviously is, or else Xamarin (Miguel and the other devs from back at Ximian) would have gone bankrupt by now. I'm not interested in developing games with MonoTouch/MonoDroid, but obviously others are. I'm also not interested in writing code in C#, but obviously there are those who do. The (hopefully) continued success of MonoTouch/Droid will enable Xamarin to continue developing Mono, which is thus yet another 'weapon' in the great open source arsenal of languages/frameworks. Personally I could be perfectly happy with just having C/C++/Python available; lots of other people would likely take poison rather than be confined to my favourite languages, and they are no more right or wrong in their preferences than I am.

      Choice is good, alternatives are good, competition is good.



      • #33
        Some of you seem to think that garbage collection is a feature unique to VM-based languages (like Java and C#). That's not true; you can use a GC with C/C++ as well, see http://www.iecc.com/gclist/GC-faq.ht...C,%20and%20C++ (yes, it's the same GC used by Mono).
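
        A minimal sketch of what that looks like with the Boehm collector (assuming the libgc development package is installed; link with -lgc). Memory obtained through GC_MALLOC is reclaimed automatically, with no matching free:
        Code:
        #include <gc.h>     // Boehm-Demers-Weiser conservative collector
        #include <cstdio>

        int main() {
                GC_INIT();                                     // initialize the collector
                for (int i = 0; i < 10000000; i++) {
                        int* p = (int*)GC_MALLOC(sizeof(int)); // collected automatically, never freed by hand
                        *p = i;
                }
                std::printf("done\n");
                return 0;
        }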

        Another misconception, about C++ specifically, is that you have to mix memory management with the algorithm itself. This is true for C (unless you use the GC mentioned above), but C++ allows for RAII. That means well written C++ code will not have random deletes scattered around; all of them should be in destructors. While this sometimes implies writing a few extra helper classes, with C++11 (or legacy C++ plus Boost) smart pointers usually do the job. And the best part is that a regular C# or Java developer could probably do a half decent job of properly using smart pointers in C++.
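
        A minimal sketch of that RAII style (hypothetical names): ownership lives in a smart-pointer member, so there is no explicit delete anywhere in the algorithm code:
        Code:
        #include <memory>
        #include <vector>

        struct Texture { /* ... */ };

        class Sprite {
                std::unique_ptr<Texture> texture;   // owned; released by Sprite's implicit destructor
        public:
                Sprite() : texture(new Texture()) {}
                // no hand-written destructor and no delete anywhere
        };

        int main() {
                std::vector<Sprite> sprites;
                for (int i = 0; i < 100; i++) sprites.emplace_back();
                return 0;                           // all Textures freed when the vector is destroyed
        }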

        Each approach, GC and RAII, has its own advantages and disadvantages, so I would count it as an advantage of C++ that it allows the use of either of them depending on your needs.

        There is one big advantage of VM-based languages, though, and that is reflection, and in general better mechanisms for RTTI and exceptions. So I think a VM-based language that allows using RAII instead of forcing the use of a GC would be a really useful tool, but I'm not aware of any such language, so I'm sticking with C++ for now.



        • #34
          Originally posted by Ansla View Post
          Some of you seem to think that garbage collection is a feature unique to VM-based languages (like Java and C#). That's not true; you can use a GC with C/C++ as well, see http://www.iecc.com/gclist/GC-faq.ht...C,%20and%20C++ (yes, it's the same GC used by Mono).

          Another misconception, about C++ specifically, is that you have to mix memory management with the algorithm itself. This is true for C (unless you use the GC mentioned above), but C++ allows for RAII. That means well written C++ code will not have random deletes scattered around; all of them should be in destructors. While this sometimes implies writing a few extra helper classes, with C++11 (or legacy C++ plus Boost) smart pointers usually do the job. And the best part is that a regular C# or Java developer could probably do a half decent job of properly using smart pointers in C++.
          Mono does use both Boehm and a generational garbage collector. GC can be done in C/C++; the proof is that the VMs themselves implement it in C/C++. But in the VMs it is a default, well supported piece of functionality.
          Smart pointers, for example, are slow: if you iterate in a for loop you get a lot of reference increments/decrements, whereas a GC has virtually no overhead there. Another problem that is solved well by a GC is when you don't know the ownership of the objects, meaning you don't know whether you may delete an object (because it may or may not be yours). Take a Pipes & Filters pattern: you have a chain of components in your program, and the first component of the chain has to get some data and send it forward to the next component. The programmer of the next component takes ownership of that data, but if that programmer is lazy (or makes a mistake) and never releases the reference, it leaks memory. A GC makes these designs natural. There are workarounds, like using global memory pools so you can at least detect leaks and clean them up periodically, but again that is a burden on the C++ developer, while in Java/C# it comes naturally.
          In the end, Boost's shared_ptr (from the smart_ptr library) is the template most people use to work with smart pointers, and it is easy to use and thread-safe. I agree that most developers should get used to smart pointers as a default practice, and that gets you close to the C#/Java experience in most cases.
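
          A minimal sketch of the shared-ownership situation described above (hypothetical Filter/Message names): with shared_ptr, no component in the chain needs to know who deletes the message; the last holder does:
          Code:
          #include <memory>
          #include <vector>

          struct Message { int payload; };

          // Each filter just keeps a shared_ptr; nobody has to track who owns the Message.
          struct Filter {
                  std::vector<std::shared_ptr<Message> > inbox;
                  void Receive(std::shared_ptr<Message> m) { inbox.push_back(m); }
          };

          int main() {
                  Filter a, b;
                  std::shared_ptr<Message> msg(new Message());
                  a.Receive(msg);
                  b.Receive(msg);   // both filters share the same Message
                  return 0;         // freed once the last shared_ptr goes away
          }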



          • #35
            Originally posted by ciplogic View Post
            Mono does use both Boehm and a generational garbage collector.
            Boehm is also generational.

            Originally posted by ciplogic View Post
            GC can be done in C/C++; the proof is that the VMs themselves implement it in C/C++. But in the VMs it is a default, well supported piece of functionality.
            Not sure what you wanted to say with this, but if you mean that the VM languages' compilers help the GC with information about memory layout, then indeed, gcc is not that well integrated with the GC, so you have to provide hints yourself or the GC will perform poorly.

            Originally posted by ciplogic View Post
            Smart pointers, for example, are slow: if you iterate in a for loop you get a lot of reference increments/decrements, whereas a GC has virtually no overhead there.
            You mean making copies of a pointer inside a loop? That would indeed introduce significant overhead, but it's a corner case that can usually be avoided. So you could say that the overhead of smart pointers is proportional to the number of times you copy or pass around a pointer, while for a GC it's proportional to the amount of memory allocated (or the amount of memory recently allocated, for a generational GC). The problem with either type of GC is that all this overhead is concentrated in big spikes, so even if the average overhead is smaller than that of smart pointers, those spikes make it unusable for real-time processing, for instance.

            Originally posted by ciplogic View Post
            Another problem that is solved well by a GC is when you don't know the ownership of the objects, meaning you don't know whether you may delete an object (because it may or may not be yours).
            That is exactly what smart pointers solve, so I'm not sure what your point was.



            • #36
              Originally posted by Ansla View Post
              You mean making copies of a pointer inside a loop? That would indeed introduce significant overhead, but it's a corner case that can usually be avoided. So you could say that the overhead of smart pointers is proportional to the number of times you copy or pass around a pointer, while for a GC it's proportional to the amount of memory allocated (or the amount of memory recently allocated, for a generational GC). The problem with either type of GC is that all this overhead is concentrated in big spikes, so even if the average overhead is smaller than that of smart pointers, those spikes make it unusable for real-time processing, for instance.


              I know this code can take multiple forms, and I know it can be changed to use a const reference, but the idea remains the same (whenever, inside the loop, one smart pointer is assigned or copied to another).
              Code:
              #include <memory>
              #include <vector>

              // Assumes struct Item and ProcessItem(std::shared_ptr<Item>) are defined elsewhere.
              std::vector<std::shared_ptr<Item> > items;
              for (auto it = items.begin(); it != items.end(); ++it)
              {
                  ProcessItem(*it); // passes the shared_ptr by value: one reference increment and one decrement per call
              }
              In practice, code like this was the second biggest time consumer, after the memory allocator, in one application I worked on (around 4 years ago). And that code was not badly written by default, but because you did not know what would take the object or when, most people would not bother to use a constant reference or the raw pointer itself.
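
              For comparison, a minimal self-contained sketch of the const-reference variant mentioned above (the same fix that comes up again later in the thread); binding a const reference to the stored shared_ptr does not touch the reference count:
              Code:
              #include <memory>
              #include <vector>

              struct Item { int x; int y; };

              // const reference: no copy of the shared_ptr, so no ref-count traffic per call
              static void ProcessItem(const std::shared_ptr<Item>& item) { (void)item; }

              int main() {
                      std::vector<std::shared_ptr<Item> > items(10, std::make_shared<Item>());
                      for (auto it = items.begin(); it != items.end(); ++it)
                              ProcessItem(*it);
                      return 0;
              }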
              If you look up Tamarin's virtual machine design talk at Stanford, the Tamarin VM used reference counting plus a cycle collector for circular references. For tight loops they used "deferred reference counting", meaning that reference-count updates for short-lived local references are skipped and only reconciled periodically.
              This is why Wikipedia's reference counting page has a section about it.
              Originally posted by Ansla View Post
              <<When you don't know the ownership>>
              That is exactly what smart pointers solve, so I'm not sure what your point was.
              In Pipes & Filters, sometimes you don't know who has to delete the object, but you can also (unintentionally) create cycles along the chain of the program, and then nobody knows who has to delete what. This is why people should use weak references.
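
              A minimal sketch of that last point (hypothetical Filter nodes): the back link is a std::weak_ptr, so connecting two filters into a cycle does not keep them alive forever:
              Code:
              #include <memory>

              struct Filter {
                      std::shared_ptr<Filter> next;   // owning link down the chain
                      std::weak_ptr<Filter> prev;     // non-owning back link: breaks the reference cycle
              };

              int main() {
                      std::shared_ptr<Filter> a = std::make_shared<Filter>();
                      std::shared_ptr<Filter> b = std::make_shared<Filter>();
                      a->next = b;   // a keeps b alive
                      b->prev = a;   // weak: does not keep a alive
                      return 0;      // both filters are destroyed here; two shared_ptr links would have leaked
              }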



              • #37
                Originally posted by Ansla View Post
                The problem with either type of GC is that all this overhead is concentrated in big spikes, so even if the average overhead is smaller than that of smart pointers, those spikes make it unusable for real-time processing, for instance.
                That's what really rankles me about the triumphalism that sometimes accompanies discussions of GC. "Real-time" is not some exotic thing anymore, if it ever was. Between GUI "snappiness", steady frame rates in video games, lack of stuttering in media players, lack of screwiness in VoIP service, et al., the theoretical space of "real-time" now effectively covers a considerable range of real-world requirements demanded by users on a daily basis.

                That's not to say that GC isn't tremendously useful when applied appropriately. But -- much like OOP in the early 90s -- there are a lot of people acting like it's eminently superior 99.999% of the time instead of (say) 70% of the time, and bad things happen when that attitude informs platform specifications.



                • #38
                  Originally posted by Ex-Cyber View Post
                  That's what really rankles me about the triumphalism that sometimes accompanies discussions of GC. "Real-time" is not some exotic thing anymore, if it ever was. Between GUI "snappiness", steady frame rates in video games, lack of stuttering in media players, lack of screwiness in VoIP service, et al., the theoretical space of "real-time" now effectively covers a considerable range of real-world requirements demanded by users on a daily basis.

                  That's not to say that GC isn't tremendously useful when applied appropriately. But -- much like OOP in the early 90s -- there are a lot of people acting like it's eminently superior 99.999% of the time instead of (say) 70% of the time, and bad things happen when that attitude informs platform specifications.
                  I agree. I think anyone in the Java/C# world would agree too. Programs that don't handle GC pauses well have to be written in a language that can circumvent them. But that is rarely the case for 70% of applications, and when people complain here about Mono they make the remaining 30% appear to be the majority. As for games, media players and VoIP services: there are games that use a GC-based language without problems. I mean that even within that 30% of applications, if you take care to work around the big GC pauses, real games can work with it (it is not easier, but it is possible in most cases).
                  The biggest slowness in most Java/C# games/applications, as far as I have worked with them, is not the GC but the JIT. And sometimes triggering a more powerful JIT (one that computes more optimizations) means longer startup pauses. This article talks about it at the end.

                  Conclusions: Why is "Java is Slow" so Popular?
                  (...)
                  Java program startup is slow. As a java program starts, it unzips the java libraries and compiles parts of itself, so an interactive program can be sluggish for the first couple seconds of use.

                  This approaches being a reasonable explanation for the speed myth. But while it might explain user's impressions, it does not explain why many programmers (who can easily understand the idea of an interpreted program being compiled) share the belief.

                  Two of the most interesting observations regarding this issue are that:

                    • there is a similar "garbage collection is slow" myth that persists despite decades of evidence to the contrary, and
                    • in web flame wars, people are happy to discuss their speed impressions for many pages without ever referring to actual data.
                  (This is why I write here most of the time: I'll be glad if users really test Mono, Java or Ruby and, if it fits their needs, use it. If it doesn't, dump it, but don't rant about how slow things are without having worked with them; be realistic.)
                  There are papers that appear to say the same. This one says that a full register allocator would be the biggest part of Java's server compiler, and that even a very fast linear scan register allocator accounts for 20-30% of compile time. Mono uses the same "linear scan" allocator. This means that computational code with lots of variables will perform visibly worse than what a C/C++ register allocator produces (something like 15-30% slower). But the part where you click something (the first time) and wait for your application to be JIT-compiled is what is annoying (for some) and appears slow.
                  I used NetBeans fairly often, and when a 1 second GC pause appeared it would free something like 200-300 MB of memory at once. If a typical application had, say, 15 MB to release as garbage, that would work out to roughly 50 ms per collection, which is much more manageable for most users (I don't know of a VoIP client that doesn't buffer around 200 ms of sound, so such a GC pause would be unnoticeable).
                  There are also solutions for JIT slowness, namely compiling the code offline (if you use .NET, all the assemblies in the shared cache are precompiled using the NGen tool). If Mono fixes the LLVM code generator so it can create AOT images on Linux (it didn't work, at least with 2.10), it can create "faster than life" assemblies (backed by a compiler of C++'s class) and will not be slow at startup.
                  Last edited by ciplogic; 03-28-2012, 04:05 AM.



                  • #39
                    Well, I did some micro-benchmarks with the corner case suggested by ciplogic and these are the results:

                    gcc with regular pointers
                    0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 4384maxresident)k
                    0inputs+0outputs (0major+338minor)pagefaults 0swaps

                    gcc with boehm-gc
                    GC Warning: USE_PROC_FOR_LIBRARIES + GC_LINUX_THREADS performs poorly.
                    0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 6192maxresident)k
                    0inputs+0outputs (0major+484minor)pagefaults 0swaps

                    gcc with native smart pointers
                    2.73user 0.00system 0:02.74elapsed 99%CPU (0avgtext+0avgdata 4352maxresident)k
                    0inputs+0outputs (0major+335minor)pagefaults 0swaps

                    gcc with boost smart pointers
                    19.03user 0.00system 0:19.04elapsed 99%CPU (0avgtext+0avgdata 4336maxresident)k
                    0inputs+0outputs (0major+334minor)pagefaults 0swaps

                    clang with regular pointers
                    0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 4368maxresident)k
                    0inputs+0outputs (0major+337minor)pagefaults 0swaps

                    clang with boehm-gc
                    GC Warning: USE_PROC_FOR_LIBRARIES + GC_LINUX_THREADS performs poorly.
                    0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 6176maxresident)k
                    0inputs+0outputs (0major+483minor)pagefaults 0swaps

                    clang with boost smart pointers
                    19.07user 0.00system 0:19.08elapsed 99%CPU (0avgtext+0avgdata 4352maxresident)k
                    0inputs+0outputs (0major+335minor)pagefaults 0swaps

                    Mono:
                    14.35user 0.00system 0:14.36elapsed 99%CPU (0avgtext+0avgdata 17360maxresident)k
                    0inputs+0outputs (0major+1226minor)pagefaults 0swaps

                    Java:
                    41.29user 0.09system 0:41.29elapsed 100%CPU (0avgtext+0avgdata 124352maxresident)k
                    0inputs+0outputs (0major+10177minor)pagefaults 0swaps

                    Gcj:
                    86.23user 0.04system 1:26.29elapsed 99%CPU (0avgtext+0avgdata 161680maxresident)k
                    0inputs+0outputs (0major+10278minor)pagefaults 0swaps
                    First of all, the only test with clang that I was really interested in, the one with native smart pointers, could not be executed, because C++0x support in clang is broken: it fails to parse the STL header files when given the -std=c++0x argument. I found on the net that this is a known issue with clang 2.9 that is supposedly resolved in clang 3.0, but I have 3.0 installed and it still doesn't work.

                    So, for the cases that did work: the test consists of calling a function that does nothing, with a pointer as parameter. The main target of this benchmark was C++ with its smart pointers, but I also added boehm-gc, Mono and Java for reference.

                    Now, the obvious result is that Java is really slow! No excuses like long time spent unpacking archives or other bullshit; both the Icedtea and gcj implementations of Java are plain ridiculously slow. Mono actually does a decent job here, being faster than C++ with boost::shared_ptr, but that still doesn't justify switching to Mono if C++11 std::shared_ptr or boehm-gc are valid alternatives. And this was an ideal test for garbage collectors, as there was basically no memory to collect for the entire test; it just measured how well the GC stays out of the way when it's not needed.

                    If anybody is interested in the sources in order to replicate the results (maybe get clang's c++0x support working?), I will post them in another comment; this one is getting too big already.



                    • #40
                      Java slow?

                      I wrote similar code in Java and C++ (I will do C# if needed).

                      C++ code (I used VS 2010, Release):

                      Code:
                      #include <stdio.h>
                      #include <memory>
                      #include <Windows.h>
                      
                      int addDummy(std::shared_ptr<int> data)
                      {
                      	int value =*data+2;
                      	return value;
                      }
                      
                      int main()
                      {
                      	std::shared_ptr<int> data(new int (2));
                      	int iterations = 1000*1000*1000;
                      
                      	long begin = GetTickCount();
                      	int sum = 0; // overflows over 10^9 iterations, hence the negative value in the output
                      	for(int i = 0; i <iterations;i++)
                      		sum+=addDummy(data);
                      		
                      	long end =GetTickCount();
                      	printf("Time: %d %d ms\n", (end-begin), sum);
                      	system("pause");
                      	return 0;
                      }
                      Output:
                      Time: 20685 -294967296 ms
                      Java code:
                      Code:
                      package run;
                      
                      public class Start {
                      
                      	public static void main(String[] args) {
                      		Integer value = new Integer(2);
                      		int iterations = 1000*1000*1000;
                      		long start = System.currentTimeMillis();
                      		int sum =0;
                      		// TODO Auto-generated method stub
                      		for(int i = 0; i <iterations;i++)
                      		{
                      			sum += process(value);
                      		}
                      		long end = System.currentTimeMillis();
                      		System.out.println("Time: "+(end-start) + " " + sum);
                      	}
                      
                      	private static int process(Integer value) {
                      		int test = value.intValue() + 2;
                      		return test;
                      	}
                      
                      }
                      Output:
                      Code:
                      Time: 114 -294967296
                      As you can see, the Java code solves the same computation, using an Integer value (which boxes an int "primitive type"), in 114 ms, against the VS 2010 Release C++ build which needed 20685 ms. Changing the line:

                      Code:
                      int addDummy(std::shared_ptr<int> data)
                      to

                      Code:
                      int addDummy(const std::shared_ptr<int>& data)
                      will give C++ a 0 ms result (and it is really great that the C++ optimizer can resolve it with no computation at runtime).
                      Last edited by ciplogic; 03-28-2012, 08:28 AM.



                      • #41
                        Originally posted by Ansla View Post
                        If anybody is interested in the sources in order to replicate the results (maybe get clang's c++0x support working?), I will post them in another comment; this one is getting too big already.
                        Please do. A pastebin link would be perfect.



                        • #42
                          Originally posted by ciplogic View Post
                          Java slow?

                          I wrote similar code in Java and C++ (I will do C# if needed).

                          As you can see, the Java code solves the same computation, using an Integer value (which boxes an int "primitive type"), in 114 ms, against the VS 2010 Release C++ build which needed 20685 ms. Changing the line:
                          Well, this is my code that generated those results:

                          smart_pointers.cpp
                          Code:
                          #include <vector>
                          #include <memory>
                          #include <boost/smart_ptr.hpp>
                          
                          #ifndef POINTER
                          #define POINTER Item*
                          //#define POINTER std::shared_ptr<Item>
                          //#define POINTER boost::shared_ptr<Item>
                          #endif
                          
                          struct Item {
                                  int x;
                                  int y;
                          };
                          
                          std::vector<POINTER> items;
                          
                          __attribute__ ((noinline)) void ProcessItem(POINTER item) {}
                          
                          int main() {
                                  for(int i = 0; i < 10; i++) {
                                          POINTER x(new Item);
                                          items.push_back(x);
                                  }
                                  for(int i = 0; i < 100000000; i++)
                                          for(auto it = items.begin(); it != items.end(); ++it)
                                                  ProcessItem(*it);
                                  return 0;
                          }
                          smart_pointers.cs
                          Code:
                          using System.Collections.Generic;
                          
                          class Item {
                                  int x;
                                  int y;
                          };
                          
                          class Dummy {
                                  static List<Item> items = new List<Item>();
                          
                                  static void ProcessItem(Item item) {
                                  }
                          
                                  static int Main(string[] args) {
                                          for(int i = 0; i < 10; i++) {
                                                  items.Add(new Item());
                                          }
                                          for(int i = 0; i < 100000000; i++)
                                                  foreach(Item item in items)
                                                          ProcessItem(item);
                                          return 0;
                                  }
                          }
                          smart_pointers.java
                          Code:
                          import java.util.*;
                          
                          class Item {
                                  int x;
                                  int y;
                          };
                          
                          class Dummy {
                                  static Vector<Item> items = new Vector<Item>();
                          
                                  static void ProcessItem(Item item) {
                                  }
                          
                                  public static void main(String[] args) {
                                          for(int i = 0; i < 10; i++) {
                                                  items.add(new Item());
                                          }
                                          for(int i = 0; i < 100000000; i++)
                                                  for(Item item : items)
                                                          ProcessItem(item);
                                  }
                          }
                          Makefile
                          Code:
                          CXXFLAGS=-std=c++0x -O2
                          #clang chokes on STL headers with c++0x
                          CLANG_EXTRA_FLAGS=-std=c++98
                          LDFLAGS=-lboost_system
                          
                          all: execute
                          
                          gccregular: smart_pointers.cpp
                                  g++ -o $@ $(CXXFLAGS) $(LDFLAGS) $^
                          
                          gccsmart: smart_pointers.cpp
                                  g++ -o $@ -DPOINTER=std::shared_ptr\<Item\> $(CXXFLAGS) $(LDFLAGS) $^
                          
                          gccboost: smart_pointers.cpp
                                  g++ -o $@ -DPOINTER=boost::shared_ptr\<Item\> $(CXXFLAGS) $(LDFLAGS) $^
                          
                          clangregular: smart_pointers.cpp
                                  clang++ -o $@ $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^
                          
                          clangsmart: smart_pointers.cpp
                                  clang++ -o $@ -DPOINTER=std::shared_ptr\<Item\> $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^
                          
                          clangboost: smart_pointers.cpp
                                  clang++ -o $@ -DPOINTER=boost::shared_ptr\<Item\> $(CXXFLAGS) $(CLANG_EXTRA_FLAGS) $(LDFLAGS) $^
                          
                          smart_pointers.exe: smart_pointers.cs
                                  mcs $^
                          
                          Dummy.class: smart_pointers.java
                                  javac $^
                          
                          gcj: smart_pointers.java
                                  gcj -o $@ --main=Dummy $^
                          
                          execute: gccregular gccsmart gccboost clangregular clangboost smart_pointers.exe Dummy.class gcj
                                  echo -e \\ngcc with regular pointers
                                  time ./gccregular
                                  echo -e \\ngcc with boehm-gc
                                  LD_PRELOAD=/usr/lib/libgc.so time ./gccregular
                                  echo -e \\ngcc with native smart pointers
                                  time ./gccsmart
                                  echo -e \\ngcc with boost smart pointers
                                  time ./gccboost
                                  echo -e \\nclang with regular pointers
                                  time ./clangregular
                                  echo -e \\nclang with boehm-gc
                                  LD_PRELOAD=/usr/lib/libgc.so time ./clangregular
                                  #echo -e \\nclang with native smart pointers
                                  #time ./clangsmart
                                  echo -e \\nclang with boost smart pointers
                                  time ./clangboost
                                  echo -e \\nMono:
                                  time mono smart_pointers.exe
                                  echo -e \\nJava:
                                  time java Dummy
                                  echo -e \\nGcj:
                                  time ./gcj
                          
                          clean:
                                  rm -f gccregular gccsmart gccboost clangregular clangsmart clangboost gcj *.exe *.class
                          The main difference I see is that you got rid of the vector in the test you performed, so I suppose it's iterating over the vector that kills Java performance in my case.



                          • #43
                            Originally posted by Ansla View Post
                            smart_pointers.java
                            Code:
                            import java.util.*;
                            
                            class Item {
                                    int x;
                                    int y;
                            };
                            
                            class Dummy {
                                    static Vector<Item> items = new Vector<Item>();
                            
                                    static void ProcessItem(Item item) {
                                    }
                            
                                    public static void main(String[] args) {
                                            for(int i = 0; i < 10; i++) {
                                                    items.add(new Item());
                                            }
                                            for(int i = 0; i < 100000000; i++)
                                                    for(Item item : items)
                                                            ProcessItem(item);
                                    }
                            }
                            The main difference I see is that you got rid of the vector in the test you performed, so I suppose it's iterating over the vector that kills Java performance in my case.
                            Yes, I think the same. For-each loops in Java and C# get an iterator. Java will also do type checks (as generics are implemented with type erasure), so every time it checks whether the Object stored in the items collection is really an Item. It is as if, in C++, you kept a container of base-class pointers and performed a checked downcast at every loop step before handing the pointer to the Process method; that cast may be the most expensive operation in the loop, not the loop body itself. C# generics do not lose the type information, so the iterator does not need that per-element check, and it is not part of C#/C++ generics/templates code.
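
                            A rough C++ sketch of that analogy (hypothetical; it uses a polymorphic base class rather than void*, since dynamic_cast cannot be applied to a void*):
                            Code:
                            #include <vector>

                            // The container only knows a common base type, so every element access
                            // needs a checked downcast, roughly what Java's erased generics imply.
                            struct Object { virtual ~Object() {} };
                            struct Item : Object { int x; int y; };

                            static void ProcessItem(Item* item) { (void)item; }

                            int main() {
                                    std::vector<Object*> items;                    // "erased" element type
                                    for (int i = 0; i < 10; i++) items.push_back(new Item());

                                    for (Object* obj : items) {
                                            Item* item = dynamic_cast<Item*>(obj); // runtime type check per element
                                            if (item) ProcessItem(item);
                                    }
                                    for (Object* obj : items) delete obj;
                                    return 0;
                            }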

                            I changed the program slightly, to iterate over an array of Item, like the following:
                            Code:
                            package run;
                            (...)
                                    static Item[] items = new Item[10];
                            (...)
                                            for(int i = 0; i < 10; i++) {
                                                    items[i] =new Item();
                                            }
                            (...)
                            And the output is:
                            Code:
                            Time: 727 ms
                            On my machine the original time was:
                            Code:
                            Time: 55430 ms
                            Also, if you play around with this and always get 0 ms, it means you hit a compiler optimization (on either the Java or the C++ side). In my original code, if you removed the +2 (in the process method), the Java optimizer would optimize it away and you would get times of about 4 to 10 ms, but I wanted to be as objective as possible.
                            Last edited by ciplogic; 03-28-2012, 10:50 AM.



                            • #44
                              Originally posted by ciplogic View Post
                              Yes, I think the same. For-each loops in Java and C# get an iterator. Java will also do type checks (as generics are implemented with type erasure), so every time it checks whether the Object stored in the items collection is really an Item. It is as if, in C++, you kept a container of base-class pointers and performed a checked downcast at every loop step before handing the pointer to the Process method; that cast may be the most expensive operation in the loop, not the loop body itself.

                              I change the program lightly, to iterate over an array of Item, like following:
                              Code:
                              package run;
                              (...)
                                      static Item[] items = new Item[10];
                              (...)
                                              for(int i = 0; i < 10; i++) {
                                                      items[i] =new Item();
                                              }
                              (...)
                              And the output is:
                              Code:
                              Time: 727 ms
                              I replaced Vector with ArrayList and it reduced the time for Icedtea to 3.17 seconds and gcj to 37.86 seconds. So it's just the Vector in Java that's slow as hell.

                              Originally posted by ciplogic View Post
                              Also, if you play around with this and always get 0 ms, it means you hit a compiler optimization (on either the Java or the C++ side). In my original code, if you removed the +2 (in the process method), the Java optimizer would optimize it away and you would get times of about 4 to 10 ms, but I wanted to be as objective as possible.
                              That's why I used __attribute__ ((noinline)) in the C++ code. Do you know if there is an equivalent for Java/C#? Or are they just too high level to allow such details to be tweaked?



                              • #45
                                Originally posted by Ansla View Post
                                I replaced Vector with ArrayList and it reduced the time for Icedtea to 3.17 seconds and gcj to 37.86 seconds. So it's just the Vector in Java that's slow as hell.


                                That's why I used __attribute__ ((noinline)) in the C++ code. Do you know if there is an equivalent for Java/C#? Or are they just too high level to allow such details to be tweaked?
                                Yes, in .NET 4.5/the latest Mono, but it is a fairly new development. As for Java, I don't think there is anything like it.

                                Also, Vector is essentially a synchronized version of "std::vector", while ArrayList is not synchronized. The link I gave you also states that plain arrays are the fastest construct (the dynamic cast check is not needed there).
                                Java supports escape analysis, but it has to be enabled with flags, and when the optimization applies it can effectively treat a Vector like an ArrayList (eliding its locking). You have to add the following to Java's arguments: -server -XX:+DoEscapeAnalysis
                                Last edited by ciplogic; 03-28-2012, 11:18 AM.

