KLANG: A New Linux Audio System For The Kernel


  • #61
    Originally posted by datenwolf View Post
    So what does this mean for KLANG? KLANG can be compiled with either 40 bits or 48 bits per sample processing, with a footroom of 8 bits (this is for reversible attenuation of up to 48dB, which is plenty).
    Apparently you don't follow any of the trends in digital audio. Just as almost everybody else (except Avid/Digidesign) has adopted floating point as THE standard format for computer-based digital audio, you want to revert Linux to fixed point. This is supposed to be smart?

    You're going to explain to all the developers of every pro-audio and music creation software who might consider Linux as a platform for their work that their samples will be converted twice, once from the floating point format that their code uses into fixed point and then again into integer format before it hits the audio hardware? And that when they share audio data between applications, which will both be using floating point format, it will be converted to and then from fixed point as an intermediate?

    Could you at least spend several months hanging out with audio developers before you try to redesign the kernel subsystems that we rely on?
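    A rough sketch of the conversion chain described above: a float sample converted to a wide fixed-point format, then re-quantized down to the integer format the hardware takes. The Q-format parameters here (39 fractional bits, int16 output) are illustrative guesses, not taken from KLANG's actual code:

```python
# Sketch of the double conversion described above (illustrative Q-format,
# not KLANG's actual parameters): float sample -> wide fixed point -> int16.

FRAC_BITS = 39                # fractional bits of a hypothetical 48-bit format
SCALE = 1 << FRAC_BITS

def float_to_fixed(x: float) -> int:
    """Quantize a float sample in [-1.0, 1.0) to signed fixed point."""
    return round(x * SCALE)

def fixed_to_int16(q: int) -> int:
    """Re-quantize fixed point down to the 16 bits a typical DAC expects."""
    v = q >> (FRAC_BITS - 15)          # keep the top 16 bits
    return max(-32768, min(32767, v))  # saturate rather than wrap

pcm = fixed_to_int16(float_to_fixed(0.5))
print(pcm)  # 16384
```

    Each re-quantization step rounds or truncates; that accumulated cost is exactly the double conversion being objected to.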



    • #62
      Originally posted by Khudsa View Post
      Is this post in the Steam Linux thread, about a new audio project, from a developer of your group?
      This is indeed the case.



      • #63
        Looking at it from an end-user perspective, what will change?

        I use ALSA + PulseAudio and I have never experienced any latency in native programs...
        Heck, PulseAudio even got my "Windows-only" headphones (a USB set which, without "special" software, doesn't even work on Windows) working via software mixing, and it also got the microphone and the Dolby feature working flawlessly!

        Linux audio isn't as bad as everyone says it is, at least not for me, but then again, the world doesn't revolve around me.

        I have never written any Linux-specific software that requires audio, so it would be nice if someone could give their insight into writing software for the current Linux sound system.



        • #64
          Originally posted by datenwolf View Post
          You'll hardly find 256 audio sources playing at the same time. So in practice we can mix even more stuff without problems.
          BTW, you might want to check out: http://www.3daudioinc.com/3db/showth...00-bands-of-EQ Notice that the URL alone refers to the 384 inputs on this puppy.

          I don't want to advocate that every system API needs to be designed to handle the most extreme possible use cases, but seriously - 200+ channels is really fairly typical in a major movie post-production environment. If we were discussing PulseAudio, this kind of thing would just be "not part of the problem space". But you want to replace the entire kernel-side system, not just the consumer/desktop audio server.

          You'll note that the above-referenced mixing console runs Linux, though full disclosure requires me to note that it does not use ALSA.



          • #65
            Shame

            I guess you embarrassed datenwolf to the point he will not respond.
            That's the end of KLANG then!



            • #66
              Originally posted by PaulDavis View Post
              BTW, you might want to check out: http://www.3daudioinc.com/3db/showth...00-bands-of-EQ Notice that the URL alone refers to the 384 inputs on this puppy.

              I don't want to advocate that every system API needs to be designed to handle the most extreme possible use cases, but seriously - 200+ channels is really fairly typical in a major movie post-production environment. If we were discussing PulseAudio, this kind of thing would just be "not part of the problem space". But you want to replace the entire kernel-side system, not just the consumer/desktop audio server.

              You'll note that the above-referenced mixing console runs Linux, though full disclosure requires me to note that it does not use ALSA.

              While I think it's absolutely fabulous that you're mixing, with hardware, that is controlled by software:

              What's your real software round-trip latency if you use all-software monitoring? Because frankly, 8 samples smells like bullshit to me, and I've been around pro audio since the 486 was a twinkle in Intel's eye.



              • #67
                Originally posted by Thatguy View Post
                While I think it's absolutely fabulous that you're mixing, with hardware, that is controlled by software:

                What's your real software round-trip latency if you use all-software monitoring? Because frankly, 8 samples smells like bullshit to me, and I've been around pro audio since the 486 was a twinkle in Intel's eye.
                It's not bullshit. While you may think it is impressive that you have been around pro audio since the 486 was a twinkle in Intel's eye (which I find funny, lol), I find it far more interesting that Harrison Consoles has been an industry leader (for both analog and digital consoles) for decades. Maybe you should have a look at some of their products or whitepapers. Here is a link to their website:

                http://www.harrisonconsoles.com/joom...tpage&Itemid=1

                The product in question that uses 8 samples, which you think is BS, is very likely xdubber, found here:

                http://www.harrisonconsoles.com/joom...d=23&Itemid=57

                You should also take a look at the link Paul provided (scroll halfway down the page), so you can see the physical hardware and how much 'muscle' there is. It makes your pro-audio setup look like you are still running a 486 (and I don't even need to know what your hardware is to draw that conclusion).

                http://www.3daudioinc.com/3db/showth...00-bands-of-EQ

                If you google "harrison consoles 8 samples", you should turn up some whitepapers on the subject as well.
                Last edited by ninez; 08-02-2012, 12:30 AM.



                • #68
                  A remark on inter-process communication

                  Regarding inter-process communication, you said:

                  Originally posted by PaulDavis View Post
                  You cannot be serious. It's hard to take anyone seriously who would claim such a thing. You think that two processes both touching an mmap'ed region in user space cause some kind of kernel hell? This is just ridiculous - you make it appear that you don't know how shared memory works at all! The address spaces have the region mapped. When each process touches part of the region, NOTHING HAPPENS - it's a memory access.
                  His point was indeed incorrect, but shared memory in itself is not enough: for several processes to cooperate, they must use IPC to synchronize these accesses, and those IPC calls have the "two context switches" latency, so I'm not sure that putting a bigger part of the audio stack in the kernel couldn't reduce the average latency.
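                  A minimal, POSIX-only sketch of that point: the data travels through an mmap'ed shared region with plain loads and stores, but the consumer still has to be woken through an IPC primitive (a pipe used as a doorbell here), and that wake-up is where the context switches live. All names and values are illustrative:

```python
# POSIX-only sketch: shared memory carries the data with plain stores,
# but the consumer still needs an IPC wake-up (a pipe "doorbell" here),
# and that wake-up is where the context-switch latency lives.
import mmap
import os
import struct

shared = mmap.mmap(-1, 8)          # anonymous shared mapping (MAP_SHARED)
r, w = os.pipe()                   # doorbell: producer -> consumer
r2, w2 = os.pipe()                 # doorbell: consumer -> producer

pid = os.fork()
if pid == 0:                       # consumer process
    os.read(r, 1)                  # block until woken: one context switch
    val = struct.unpack("q", shared[:8])[0]
    shared[:8] = struct.pack("q", val + 1)   # plain memory access
    os.write(w2, b"x")
    os._exit(0)

shared[:8] = struct.pack("q", 41)  # producer writes straight into the region
os.write(w, b"x")                  # IPC: schedules the consumer
os.read(r2, 1)                     # wait until the consumer is done
result = struct.unpack("q", shared[:8])[0]
os.waitpid(pid, 0)
print(result)                      # 42
```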

                  Also, about the push/pull design: IMHO (I'm not an audio programmer) both are important, and having low-latency push (with low CPU usage) is important too. I'm thinking of a game which wants fast audio feedback when a player presses a button, so I'm not sure that a push buffer over a pull mechanism is "good enough" for this use case.

                  Best regards.



                  • #69
                    Originally posted by bug! View Post
                    I use ALSA + PulseAudio and I have never experienced any latency in native programs...
                    Of course not. PulseAudio and ALSA work just fine for the typical consumer workload. Even if you had an audio latency of 150 ms or more, you would probably only notice it if you were playing a very intense FPS, where there could be a very slight delay in the sounds coming from the game compared to the image you were seeing. Even then it probably wouldn't make much of a difference to you. This all changes when you need to (e.g.) synchronize a playback track and a recording track in a DAW, or apply real-time effects to audio at a live concert. You can't have a delay of 100 ms between two tracks (unless it's intentional, of course), and you can't send the audio mix back to the musicians (at a live concert) with such a delay. It has to be as close as possible to real time.



                    • #70
                      Originally posted by MickStep View Post
                      I guess you embarrassed datenwolf to the point he will not respond.
                      That's the end of KLANG then!
                      That sounds like Deutschlisch. The correct version of the statement would be: "Das is die Ende der Klang" ("That is the end of KLANG"). Not that I support these kinds of comments, though.



                      • #71
                        Originally posted by prokoudine View Post
                        That sounds like Deutschlisch. The correct version of the statement would be: "Das is die Ende der Klang" ("That is the end of KLANG"). Not that I support these kinds of comments, though.
                        It's "Das ist das Ende vom Klang".

                        (Yes, that reply has zero thread contribution value :-P)



                        • #72
                          Originally posted by RealNC View Post
                          It's "Das ist das Ende vom Klang".

                          (Yes, that reply has zero thread contribution value :-P)
                          Oh, right. "Ende" is neuter, and "Klang" is masculine. Thanks for the correction.



                          • #73
                            Originally posted by renox View Post
                            Regarding inter-process communication, you said:



                            His point was indeed incorrect, but shared memory in itself is not enough: for several processes to cooperate, they must use IPC to synchronize these accesses, and those IPC calls have the "two context switches" latency, so I'm not sure that putting a bigger part of the audio stack in the kernel couldn't reduce the average latency.
                            I guess you're either not familiar with, or not remembering, that in audio it's very common to use single-reader/single-writer ringbuffers (FIFOs), which do not require synchronization. In addition, JACK is very carefully designed so that it does not use a server->client->server->client->server design, but rather allows clients to hand off control to the next client (server->client->client->server), thus minimizing context switches. The IPC mechanism is not needed for synchronization per se; it is required to wake up each participating process (although you could view this as a kind of synchronization). A design involving kernel-side mixing will involve N-2 context switches compared to JACK.
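                            The single-reader/single-writer FIFO mentioned above can be sketched like this: one index is only ever advanced by the producer and the other only by the consumer, so no lock is needed between the two sides. A real implementation (e.g. JACK's ringbuffer) is C with memory barriers; this Python sketch only illustrates the invariant:

```python
# Single-producer/single-consumer ring buffer: the write index is only
# advanced by the producer, the read index only by the consumer, so the
# two sides need no lock (real C code would add memory barriers/atomics).

class SPSCRing:
    def __init__(self, size: int):
        assert size & (size - 1) == 0, "size must be a power of two"
        self.buf = [0.0] * size
        self.mask = size - 1
        self.write_idx = 0          # touched only by the producer
        self.read_idx = 0           # touched only by the consumer

    def push(self, sample: float) -> bool:
        if self.write_idx - self.read_idx == len(self.buf):
            return False            # full: caller drops or retries
        self.buf[self.write_idx & self.mask] = sample
        self.write_idx += 1         # publish only after the store
        return True

    def pop(self):
        if self.read_idx == self.write_idx:
            return None             # empty
        sample = self.buf[self.read_idx & self.mask]
        self.read_idx += 1
        return sample

ring = SPSCRing(8)
for s in (0.1, 0.2, 0.3):
    ring.push(s)
print(ring.pop(), ring.pop())  # 0.1 0.2
```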

                            It's also important to note that context switches do not by themselves increase latency. They do reduce the number of CPU cycles available for processing audio before a deadline is missed. Since in most situations there are ample CPU cycles available, this only becomes important when loading up the cores with huge amounts of very expensive (typically pro-audio) processing (reverbs, for example, tend to be very expensive). Even then, the cost of a context switch these days is mostly related to the size of the working set of the switched-to thread, so if you have an audio processing thread that doesn't touch much data, the overhead of a switch is very, very, very small - much smaller than would be incurred by many typical audio processing operations. Unless you plan to chain up dozens of threads with context switches, the overhead is really small, which means that the impact on the lowest achievable latency is small.

                            What *is* important about context switches is that they provide an opportunity for the kernel to get scheduling decisions wrong. This is a real problem, though it's getting better all the time.

                            Also, about the push/pull design: IMHO (I'm not an audio programmer) both are important, and having low-latency push (with low CPU usage) is important too. I'm thinking of a game which wants fast audio feedback when a player presses a button, so I'm not sure that a push buffer over a pull mechanism is "good enough" for this use case.
                            It seems to have worked just fine for the legions of games based on DirectSound on Windows. But that aside, I think it would be fantastic for ALSA to provide CoreAudio-like access to the h/w buffer used by most audio interfaces these days. That requires adding a DLL/PLL to estimate the position where the hardware is currently reading/writing, which ALSA does not have at this time. If it did, it would be possible to use the high-res timer interrupt to provide arbitrary latency for different applications (i.e. latency would not be based on some multiple of the audio interface interrupt interval, as it is now) AND it would be possible for read/write calls (i.e. a push-model API) to get very, very close to the current read/write location. Adding this to ALSA would be a MAJOR contribution to low-level Linux audio, but it is also a lot of work. It does not, however, require tearing up the infrastructure and device drivers that we already have.
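                            The DLL idea can be sketched as follows: from jittery per-period interrupt timestamps, keep a smoothed estimate of the period start time and period length, so the hardware read/write position can be interpolated at any instant. This is loosely modeled on Fons Adriaensen's published DLL design; all gains and numbers here are illustrative, not ALSA code:

```python
# Sketch of the DLL/PLL idea: filter noisy per-period interrupt timestamps
# into a smooth (period start, period length) estimate, then interpolate
# the hardware position between interrupts. Coefficients are illustrative.
import math

class AudioDLL:
    def __init__(self, period_s: float, bandwidth_hz: float = 1.0):
        w = 2.0 * math.pi * bandwidth_hz * period_s
        self.b = math.sqrt(2.0) * w   # proportional (phase) gain
        self.c = w * w                # integral (frequency) gain
        self.t0 = 0.0                 # estimated start of the current period
        self.per = period_s           # estimated period length

    def update(self, t_irq: float) -> None:
        """Feed one (noisy) interrupt timestamp into the loop filter."""
        err = t_irq - (self.t0 + self.per)
        self.t0 += self.per + self.b * err
        self.per += self.c * err

    def position_frames(self, now: float, rate: int) -> float:
        """Interpolated hardware position at an arbitrary wall-clock time."""
        return (now - self.t0) * rate

dll = AudioDLL(period_s=256 / 48000)
t = 0.0
for i in range(500):              # simulated interrupts with timing jitter
    t += 256 / 48000
    dll.update(t + 1e-5 * math.sin(i))
```

After feeding a few hundred periods, the filtered period estimate stays close to the true 256/48000 s despite the injected jitter, which is what makes interpolating between interrupts viable.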



                            • #74
                              Hello. I'm not an engineer, but I have serious doubts about your viewpoint.

                              Originally posted by datenwolf View Post
                              An often-found misconception is that floating point equals precision. What floating point gives you is a fixed number of significant (binary) digits, the mantissa, and a value scale defined by the exponent. In a floating-point number you never get more precision than the mantissa can hold. The mantissa is usually normalized, which means its value is interpreted as being in the range 0…1, with the most significant bit shifted to the upper boundary. In addition you have a sign bit, which gives you an effective mantissa value range of -1…1, quantized by the integer range given by the mantissa bits. As long as the effective value of a floating-point number stays within the range -1…1, the scaling exponent is <=0, and what you actually have is the equivalent of an integer with the same number of bits as the mantissa (divided by the integer value range).

                              Of course, with small absolute values, and hence a negative exponent, the number resolution increases. But you don't benefit from this, because you're dealing with a quantized signal, and eventually the signal goes into a DAC, which in principle cannot resolve less than one bit. DACs are inherently fixed point; that's the nature of their circuitry.
                              I agree that floating point was never meant for precision.
                              But I have an important side note: floating point comes from the analog world, the dusty world of warm tube glow. Analog, apart from its disadvantage of a short lifespan due to instability and "dust", has two serious advantages over digital:
                              1* the ability to store much more information per state
                              2* "softness", which comes from (1)

                              That is the pure analog signal, which lost to the digital signal as digital processing evolved and could overcome its major drawback - low information density.

                              The "float" that we currently have is not the original "analog", but a fake analog form injected into a digital representation. This is why, contrary to original analog, it has stability, but pays for it with its "mantissa".

                              I personally look at this "fake analog" as a form of smart digital integer value: an integer value with built-in position compression, at the cost of a fraction of the bit capacity.

                              This means the more room there is for the mantissa, the more precision can be saved. BUT "fake analog" is always precise, unlike what you claim, because it is rooted in the digital form.

                              As such, "fake analog", aka floating point, has the following true advantages over integer:
                              - it can compress the insignificant part of a value. For example, 2.34E+30 will fit a 32-bit floating point just fine, unlike an int, which will instantly overflow unless constantly checked. So floating point operates much more flexibly.
                              - it can split (divide) values much more accurately. For example 75/6: only floating point will deliver the precise value.

                              The problem which "fake analog", aka float, introduces:
                              - division operations which result in a non-terminating value, cut off by the mantissa, will be imprecise at the tail. In the int case there will simply be a rounded, cut, inaccurate value. Example: 1/3. This comes from the nature of "fake analog" - its digital base.

                              The advantages of integer over "fake analog", aka float:
                              - as you have correctly noted, when occupying the whole bit capacity, an integer value stores more data than floating point, because in that case only the mantissa matters - in an int, the whole value is "mantissa".
                              - the value is delivered either complete, or broken on "sharp digital bricks", aka the modulus.

                              The problem with large numbers which you introduced will be discussed later on:


                              Originally posted by datenwolf View Post
                              So why use floating point at all then? Say the absolute value is larger than what would fit into the integer range: an integer would then either wrap or saturate (depending on the operations used). A floating-point number, however, then uses an exponent >0. But this means that now you must drop bits of your mantissa. If the exponent is 1, your mantissa resolution has been halved; if the exponent is 2, it's only 1/4 of the mantissa resolution, and so on.

                              In audio applications float is used because it allows you to mix several signals without fear of saturating or wrapping them. However, this means a loss in precision. It also allows you to apply a wide range of gain without thinking too much about it.

                              But let's look at this from an engineering standpoint. An IEEE 754 32-bit floating point has 23+1 bits of mantissa. That's the precision you get. Add to this 8 bits of exponent, allowing the value range to expand up to 2^127, which equates to some 20*log10(2^127) ≈ 765 dB of additional dynamic range (only usable if each signal were a square wave). You'll never find this situation in audio.
                              In cases where both the mantissa of the float and the bit range (pseudo-mantissa) of the int are capable of holding the value, they will deliver the same accurate results.
                              However, when the value falls outside the bit capacity, int will fail by clipping (at the high end) or by truncation (at the low end) - but float will deliver an accurate result with an acceptable error rate.

                              Again, if the input value has higher bit-capacity requirements than those available, both int and float will deliver inaccurate results, but the float's result is actually USABLE in audio.
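                              The clipping trade-off described above can be seen in a tiny sketch: summing many quantized tracks in a 16-bit accumulator wraps around, while a float accumulator holds the oversized intermediate value and can simply be rescaled afterwards. This is a pure-Python illustration with an emulated int16, not any real mixer's code:

```python
# Tiny illustration of the clipping trade-off: summing 64 quantized tracks
# in an emulated 16-bit accumulator wraps around, while a float accumulator
# keeps the oversized intermediate sum and can simply be rescaled.

def wrap16(v: int) -> int:
    """Emulate two's-complement int16 wraparound."""
    v &= 0xFFFF
    return v - 0x10000 if v >= 0x8000 else v

tracks = [0.7] * 64                 # 64 equally loud tracks

acc_f = sum(tracks) / 64            # float mix, rescaled afterwards: ~0.7

acc_i = 0
for s in tracks:
    acc_i = wrap16(acc_i + int(s * 32767))   # wraps after the second track

print(acc_f, acc_i)                 # the int16 result is wrapped garbage
```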


                              Originally posted by datenwolf View Post
                              This is probably the number-one key lesson taught in every numerics lecture (if not, it's a bad lecture): using floating point only makes sense if your numbers vary over a large range and are not of similar scale. In particular, never subtract floating-point numbers of similar absolute value. If the numbers are all of similar value, or in a narrow range, and precision is paramount, never use floating point. Use fixed point then. If you are a pure computer science student you actually might not have learned this, because computer science doesn't deal with such practical problems. But if you're in engineering or physics, then you must know how to work with digital numbers.
                              I'm sorry, I never attended numerics lectures, but I understand quite well that using floats on large-range numbers is only acceptable when those numbers:
                              1) have a significant part that fits inside the float mantissa
                              2) have an insignificant part whose precision does not matter AND which folds within the capacity of the float's exponent space

                              This is a very narrow case.

                              The real applications of integer vs. float depend upon the source of the values and their significance.
                              Values which have only one significant part, with an unimportant rest, are better processed (lossy-compressed) by float.
                              Values which can grow huge and are significant at every position are unsuitable for both types. Here they should be fragmented into packets and stored as integers.
                              Values which have a specific length and are all significant are better processed by int, because the float's unused exponent part will only waste bit bandwidth.
                              Values which have a specific length but which can or will be divided into a non-terminating range are better processed by float. And this is AUDIO. See:

                              If you implement an audio system using
                              a) integer
                              b) float

                              The float system will lose only in the case where the significant part grows so large that it overflows (does not fit) the mantissa. Even then, you will get a clip that is acceptable for an audio signal.
                              But the int system (a) will lose:
                              a) on sheer processing size: float is much more compact, which means more CPU time for the integer-based version. Float is also much more flexible; it saves more detail when needed, while with int you will have "unused bits" much more often.
                              b) on sound precision, because you can't divide 75/6 correctly using int. You will end up reinventing the wheel: the same bit-shifting mechanism that is already present in float. That means more clips and inaccurate sound processing using int.
                              c) for sound streams which happen to fit ideally within both approaches, neither system will have an advantage, since this float is really a "fake analog", and all bits will be divided just as correctly as with int.

                              So my conclusion, as a non-professional non-engineer: in audio processing, float brings more advantages. Walking on your hands is possible, but walking on your feet is more efficient.
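                              A small aside on the 75/6 example used above: plain integer division does truncate, but a fixed-point format with fractional bits (the kind datenwolf proposes) represents 12.5 exactly; the residual error only appears for non-terminating values like 1/3, which no binary format, float or fixed, can represent exactly. The Q16.16 choice below is illustrative, not KLANG's format:

```python
# 75/6 revisited: plain integer division truncates, but fixed point with
# fractional bits holds 12.5 exactly. Only non-terminating binary values
# like 1/3 lose tail precision - in float and fixed point alike.

FRAC = 16                                  # Q16.16, illustrative choice

def fx(x: float) -> int:
    return round(x * (1 << FRAC))          # to fixed point

def fx_div(a: int, b: int) -> int:
    return (a << FRAC) // b                # fixed-point division

def to_float(q: int) -> float:
    return q / (1 << FRAC)

print(75 // 6)                             # 12   (truncated)
print(to_float(fx_div(fx(75), fx(6))))     # 12.5 (exact in Q16.16)
print(to_float(fx_div(fx(1), fx(3))))      # ~0.3333 (tail truncated)
```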



                              • #75
                                Originally posted by crazycheese View Post
                                [...]
                                Why do you keep referring to floating point as 'fake analog'?! I'm not arguing with your post, since I am not great at mathematics - yet I have never heard anyone, or ever read any definition of floating point, that included 'fake analog'.

                                I've always been under the impression that floating point is more accurately described as 'scientific notation' in computing, and that floating-point values have advantages over both integer and fixed-point representations because they are much more 'granular', and thus can hold a much wider range of values.

                                Anyway, I just thought it was silly that you keep referring to FP as 'fake analog' in your post; it doesn't really come across as a layman's term, but rather seems like a term that shouldn't have been used to begin with.
