Enlightenment 0.23 Released With Massive Wayland Improvements


  • raster
    replied
    Originally posted by Cape View Post

    Rasterman! Your jargon is music to my ears!
    For example: you talk about objects, but isn't E pure C? Do you have some sort of integrated GC in EFL?

    Sidenote:
    Did anybody ever try E on a Librem 5 devkit or similar?
    OH ... I think I forgot to reply to this.

    I don't have the Librem devkit, so I can't say how it runs, but I do run E on my Raspberry Pi 3, Raspberry Pi 4 and RockPro64. All in all they are ballpark similar in performance and hardware - well, compared to a PC (sure, the rpi3 has only 1GB RAM, the Librem 5 has 3GB and the rockpro64 and rpi4 have 4GB; they have slightly differing core counts and ARM core designs, as well as different GPUs)... but the rpi3 will be the weakest, I imagine, and E does run there.

    And yes, E and EFL are pure C. That does not preclude objects at all. Objects are a concept, not a language feature. Some languages have explicit support for objects in the language itself, but you can have objects in C too. One way is structs: you could have obj->show(obj); for example. We don't do that - it would require exposing the memory layout of a function table and make inheritance and future expansion of classes genuinely hard. Instead we have efl_show(obj);, and that dispatches and CALLS the show method for that object's class type. It's all a lot of fun with function pointers. Given we can resolve any function by its name (to an address at runtime), we can alter what function is called via the dispatcher, and that function is then passed the object data - the dispatcher explicitly finds the offset from the start of the object's memory where the data for that class is stored and passes it to that function.
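    The function-pointer dispatch described above can be sketched in plain C roughly like this. All names here are invented for illustration - this is not the real EFL dispatcher, just a minimal demonstration of methods-via-structs with single inheritance:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: objects in plain C via per-class function tables.
 * None of these names are real EFL API. */

typedef struct Class Class;
typedef struct Obj Obj;

struct Class {
    const char  *name;
    const Class *parent;       /* single-inheritance chain */
    void (*show)(Obj *obj);    /* method slot; NULL means "inherit" */
};

struct Obj {
    const Class *klass;        /* every object starts with its class ptr */
    int          visible;
};

static void base_show(Obj *obj)
{
    obj->visible = 1;
}

/* Dispatcher: walk up the class chain until some class implements show. */
static void obj_show(Obj *obj)
{
    const Class *c = obj->klass;
    while (c && !c->show) c = c->parent;
    if (c) c->show(obj);
}

static const Class base_class  = { "Base",  NULL,        base_show };
static const Class image_class = { "Image", &base_class, NULL }; /* inherits show */
```

    Calling obj_show() on an object of image_class resolves through the parent chain to base_show, so the memory layout of the method table never leaks into caller code, which is the point raster makes about efl_show(obj) vs. obj->show(obj).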



  • Cape
    replied
    Originally posted by raster View Post

    E/EFL could be better. There are areas where we can do markedly better in terms of smoothness. […]
    Rasterman! Your jargon is music to my ears!
    For example: you talk about objects, but isn't E pure C? Do you have some sort of integrated GC in EFL?

    Sidenote:
    Did anybody ever try E on a Librem 5 devkit or similar?



  • raster
    replied
    Originally posted by Cape View Post
    My God... We should all ditch GTK/Gnome and start developing on Enlightenment...

    That thing FLIES on my potato netbook, whereas Gnome struggles even on my i7 😓

    Performance should be the driving force!!
    E/EFL could be better. There are areas where we can do markedly better in terms of smoothness. I have plans/ideas. It's just that there is a huge amount of already-optimized infra there, and some of it was built with synchronous assumptions long, long ago (e.g. you create an image object, then set the file to point it at... you get the geometry of the image to decide how to size it, then size the object, in sequence; pseudo-code):

    Code:
    obj = image_add();                      /* create the image object */
    file_set(obj, "/path/to/icon.png");     /* may block: open file, read header */
    size = size_get(obj);                   /* needs the header data loaded above */
    resize(obj, size.width, size.height);   /* size the object to the image */
    You get the idea. That is synchronous and depends on loading the PNG to get the size, so we may block (please keep reading for details on what that load involves). We have caches (speculative caches that keep data around after it's no longer used and would otherwise have been freed/deleted) to speed up re-loading the same thing, so those file_set calls become NOPs when we get cache hits, as we just dig the data out from memory we already have lurking around. We also de-duplicate on the fly (load the same image file in 20 objects and we just point to the same image data/struct in the background, sharing it across all the image instances). The software renderer will also count the uses of scaling that icon to different sizes, and if a certain destination scale size is used often enough, it'll stop on-the-fly scaling and keep a scaled copy around to avoid the rescale-on-the-fly costs (GL will always scale on the fly). There is all sorts of other fun going on too that I could spend all day describing.
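    A minimal sketch of that de-duplication idea: a refcounted cache keyed by file path, so a second load of the same path shares the first entry instead of hitting disk again. The names and fixed-size table are invented for illustration, not EFL internals:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_MAX 8

typedef struct {
    char path[256];
    int  refs;      /* live users of this shared image data */
    int  w, h;      /* metadata from the (pretend) header load */
} Image_Entry;

static Image_Entry cache[CACHE_MAX];
static int cache_used = 0;
static int disk_loads = 0;  /* counts "expensive" loads, for the demo */

static Image_Entry *image_load(const char *path)
{
    /* de-dup: same path -> same entry, just bump the refcount */
    for (int i = 0; i < cache_used; i++) {
        if (!strcmp(cache[i].path, path)) {
            cache[i].refs++;
            return &cache[i];
        }
    }
    /* miss: pretend to read the header from disk */
    assert(cache_used < CACHE_MAX);
    Image_Entry *e = &cache[cache_used++];
    snprintf(e->path, sizeof(e->path), "%s", path);
    e->refs = 1;
    e->w = 48; e->h = 48;   /* stand-in header metadata */
    disk_loads++;
    return e;
}

static void image_unref(Image_Entry *e)
{
    /* a speculative cache keeps refs==0 entries around for a while
     * instead of freeing immediately, so a re-load is a cheap hit */
    if (e->refs > 0) e->refs--;
}
```

    The speculative part is in image_unref: dropping the last reference doesn't evict the entry, which is why reloading something "already freed" can still be a cache hit.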

    You get the idea that there are multiple layers and ways we cache data and reduce overheads already, but that initial first-time-if-not-in-cache load means going to "disk" and waiting. We do split loading into "load header" vs. "load body", so we only open the file, find the header, get metadata like size and whether it has alpha, then close it and avoid a full decode - but it's still a stall. We do have a "now preload the body data in the background async and tell me when it's done via an event callback" mechanism, with a thread pool that goes and decodes the image data (the most expensive part of loading an image file), but that is explicit in higher-level code (we do this for things like wallpapers, icons etc., but not everything), and you may notice sometimes icons "appear" or "zoom/fade in" later on. That's once the "we loaded the data now - you can show that image from now on as the data is ready" event comes in. If something shows the image before this load has completed, the render pass stalls waiting on it to complete, to ensure it has the data it needs. The object HAS to remain "logically hidden" to avoid that stall. Sometimes some code somewhere just decides to go show it anyway and you didn't realize it was happening, thus causing a stall. Still, that initial header load can hurt if your disk is slow and the kernel's disk caches don't have the data readily available.

    Also, we don't always async-load the data in a thread, because by default we avoid ever decoding data if the image is never actually rendered (imagine you have 500 icons in a list and most are off-screen and not visible - why load all of them now when you can load them on demand as you scroll around and they are really needed for rendering? this is already implicit and the default). Forcing an async threaded load for everything all the time would mean in these cases we pay a decode price and memory for something that may never be needed, because you never scroll that far etc.
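    The "load the header now, preload the body async, notify via an event callback" split might look roughly like this, with a single pthread standing in for the thread pool. Names are illustrative only, and a real implementation would typically marshal the callback back to the main loop rather than firing it from the worker thread:

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

typedef struct Image Image;
struct Image {
    int  header_loaded;          /* cheap: open, read metadata, close */
    int  body_loaded;            /* expensive: full pixel decode */
    void (*on_loaded)(Image *);  /* "data is ready, safe to show" event */
};

static int notify_count = 0;
static void count_notify(Image *img) { (void)img; notify_count++; }

/* Synchronous, but cheap: just enough I/O to learn size/alpha. */
static void image_header_load(Image *img)
{
    img->header_loaded = 1;
}

static void *body_decode_thread(void *data)
{
    Image *img = data;
    usleep(1000);               /* stand-in for the expensive decode */
    img->body_loaded = 1;
    if (img->on_loaded) img->on_loaded(img);
    return NULL;
}

/* Kick off the body decode in the background; caller keeps running. */
static pthread_t image_body_preload(Image *img)
{
    pthread_t t;
    pthread_create(&t, NULL, body_decode_thread, img);
    return t;
}
```

    The stall raster describes corresponds to showing the object while body_loaded is still 0: the renderer would then have to block on the decode instead of letting the callback reveal the image when it's actually ready.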

    My point in describing this is to show we have places where we can improve. We can decode that header async too - it's also explicit in higher-level code to do this, in addition to requesting the async body load. (I have noticed there seems to be a bug here that I have yet to find, which makes the canvas think such images have no alpha channel... sometimes... need to find that some time.) We could be a whole lot better and async-decode EVERYTHING in threads, carefully picking policies on what is and isn't decoded and when. There is a lot more besides that we could spawn off into threads.

    Object construction can actually be quite costly, as it not only allocates memory but does a bunch of setup (like the above file loads) and also... produces a lot of events, which then cause event handler callbacks to be called that react by modifying the object or something else, etc. Object destruction can also be costly for the same reasons. We can defer a lot of object deletion to idle time (we already defer by 2 render cycles for state-comparison reasons, for minimum-update-region calculation). We could spool off a queue of objects to delete whilst idle, to avoid it impacting interactivity and keep the framerate snappier. We could add more higher-level object caches that cache high-level UI objects to cut the cost of creation down significantly, making things a lot snappier too.

    Our software renderer does all the hard work in a thread, but our GL renderer issues all the GL work in the main loop/thread, and this can block - especially on getting buffer age and doing a swap - so moving this to threads would help. We're far from perfect. There is much to do. Sliding it into an already complex system is hard work - especially if you don't want to break anything. We've done the "inline assembly for routines that matter and where it applies well" work - done it for x86 and ARM. It's the other things that still need work. We could move some data structs from fragmented linked lists to something more compact/array-like for better CPU/memory cacheline niceness. We could drop our call overhead by doing fewer dispatches, or doing some profiling/optimization on our call resolver/dispatcher. We've already done some caching there too, but more can be done.
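    The deferred-deletion idea - make the delete call itself cheap by just enqueueing, and do the real teardown from an idle handler - could be sketched as follows (invented names, not EFL API):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Del_Node {
    void            *obj;
    struct Del_Node *next;
} Del_Node;

static Del_Node *del_queue = NULL;

/* Called in place of immediate destruction: O(1), no teardown work,
 * so interactivity/framerate isn't hit mid-frame. */
static void obj_del_defer(void *obj)
{
    Del_Node *n = malloc(sizeof(*n));
    n->obj  = obj;
    n->next = del_queue;
    del_queue = n;
}

/* Called from an idle handler: drain the queue and do the costly part.
 * Real code would run destructors and emit deletion events here. */
static int idle_flush_deletes(void)
{
    int freed = 0;
    while (del_queue) {
        Del_Node *n = del_queue;
        del_queue = n->next;
        free(n->obj);
        free(n);
        freed++;
    }
    return freed;
}
```

    An incremental variant would cap how many objects each idle callback frees, so a huge pile of deletions still can't cause a visible hitch.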

    You may notice that a lot of the optimizing really comes down to: 1. caches (de-duplicating as well as speculative), 2. deferring work until later, 3. deferring work until idle, 4. punting work off into threads to move it out of the main loop. We're really good with #1 and #2. #3 and #4 are still a bit spotty. Add in some data-struct work and we have a #5 to do too.



  • Marzal
    replied
    Michael
    It has been almost two years since the release of Enlightenment 0.23
    I think it should be:
    It has been almost two years since the release of Enlightenment 0.22



  • dkasak
    replied
    Originally posted by Cape View Post
    My God... We should all ditch GTK/Gnome and start developing on Enlightenment...

    That thing FLIES on my potato netbook, whereas Gnome struggles even on my i7 😓

    Performance should be the driving force!!
    I've used Enlightenment from the very early (0.15) days, and it's always been unique and performant. Gnome is my fallback option - for when I break my (git) build of Enlightenment, or when something else is playing up. While I much prefer E over Gnome, performance has nothing to do with it, because Gnome has always performed well for me, on a range of desktops and laptops, with a range of underpowered Intel GPUs or more impressive AMD GPUs.



  • Cape
    replied
    My God... We should all ditch GTK/Gnome and start developing on Enlightenment...

    That thing FLIES on my potato netbook, whereas Gnome struggles even on my i7 😓

    Performance should be the driving force!!



  • c117152
    replied
    Originally posted by Michael View Post

    I was going by what Rasterman wrote - "Massive improvements to Wayland support" - in the release announcement, which also didn't elaborate.
    Probably referring to the numerous EFL fixes: https://www.enlightenment.org/news/efl-1.22.3



  • skeevy420
    replied
    Gonna have to give this a compile sometime this week. I've always been very partial to Enlightenment and I'm currently breaking my rule about only having one GUI environment setup and working (Plasma). Where other people would go towards XFCE, Openbox, LX something or other, etc for a lightweight setup, I'd always go to Enlightenment.

    I will say that I do not like a lot of their default settings. Once those are tweaked a bit it's a really nice setup. Different and takes some getting used to, but nice.



  • Michael
    replied
    Originally posted by tildearrow View Post
    "Massive"... ...think you can elaborate a little?
    I was going by what Rasterman wrote - "Massive improvements to Wayland support" - in the release announcement, which also didn't elaborate.



  • tildearrow
    replied
    "Massive"... ...think you can elaborate a little?

