Improving The Linux Kernel's Memory Performance
BTW, wasn't SSE3 a compulsory part of AMD64? If so, then this would presumably go into any AMD64 kernel with no need to check processor flags.
-
Originally posted by Smorg:
memcpy? As in... memcpy from string.h? What does this have to do with the kernel? Is there some system call that's also called memcpy?
-
Originally posted by phoronix:
Phoronix: Improving The Linux Kernel's Memory Performance
Over the past few days there's been an active discussion on the Linux kernel mailing list surrounding the memory copy (the memcpy function to copy blocks of memory) performance within the kernel. In particular, an application vendor claims to have boosted their application (a video recorder) performance by 12% when implementing an "optimized" memory copy function that takes advantage of SSE3...
http://www.phoronix.com/vr.php?view=OTgwMQ
Other than that, sounds cool. I had no idea hitting the SSE was so costly. I suppose it makes sense that they were intended for rather larger data sets, but still, it hadn't occurred to me.
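To illustrate the point about SSE being costly for small copies: inside the kernel, using SSE registers means saving and restoring the FPU/SIMD state around the copy (kernel code brackets such sections with kernel_fpu_begin() and kernel_fpu_end()), so a wide copy only pays off above some size. Below is a minimal user-space sketch in C of that size-based choice. The 512-byte cutoff and the names choose_copy_path and copy_with_threshold are assumptions made up for illustration, not anything taken from the mailing-list discussion, and the plain memcpy call merely stands in for a real SSE routine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff: below this size the simple path is assumed cheaper. */
#define WIDE_COPY_THRESHOLD 512

enum copy_path { PATH_SIMPLE, PATH_WIDE };

/* Decide which path a copy of n bytes should take. */
enum copy_path choose_copy_path(size_t n)
{
    return n >= WIDE_COPY_THRESHOLD ? PATH_WIDE : PATH_SIMPLE;
}

/* Copy n bytes from src to dst, choosing the path by size. */
void *copy_with_threshold(void *dst, const void *src, size_t n)
{
    if (choose_copy_path(n) == PATH_SIMPLE) {
        /* Small copy: plain byte loop, no SIMD state to save or restore. */
        unsigned char *d = dst;
        const unsigned char *s = src;
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    }
    /*
     * Large copy: stand-in for an SSE routine. In a kernel this is where
     * the save/restore of the SIMD state would be paid once per call.
     */
    return memcpy(dst, src, n);
}
```

Where the real break-even point lies depends on the CPU and on how expensive the state save is, so the cutoff would have to be measured rather than guessed.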
-
memcpy? As in... memcpy from string.h? What does this have to do with the kernel? Is there some system call that's also called memcpy?
-
Originally posted by abral:
I did something like this in my kernel and the performance was really impressive.
At the time, I used CPUID to find out whether the CPU supported MMX, SSE, or SSE2, and set malloc to use whatever was supported.
I didn't have a graphics driver, so drawing operations were really slow.
Without SSE2 I couldn't do any graphics operation at an acceptable speed; with SSE2 I could draw windows and move them flawlessly.
This was probably because there were only a few threads and the data to move was consistent (1280 x 1024 x 4 bytes).
-
I did something like this in my kernel and the performance was really impressive.
At the time, I used CPUID to find out whether the CPU supported MMX, SSE, or SSE2, and set malloc to use whatever was supported.
I didn't have a graphics driver, so drawing operations were really slow.
Without SSE2 I couldn't do any graphics operation at an acceptable speed; with SSE2 I could draw windows and move them flawlessly.
This was probably because there were only a few threads and the data to move was consistent (1280 x 1024 x 4 bytes).
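The CPUID-based selection described above could look roughly like the C sketch below. The bit positions are the documented CPUID leaf 1 feature bits (MMX is EDX bit 23, SSE is EDX bit 25, SSE2 is EDX bit 26, SSE3 is ECX bit 0). The enum and the name select_copy_impl are hypothetical, and this is not abral's actual code. The function takes the register values as parameters so the selection logic can be tried on any machine; with GCC or Clang on x86 the values can be read with __get_cpuid(1, &eax, &ebx, &ecx, &edx) from <cpuid.h>.

```c
#include <stdint.h>

/* CPUID leaf 1 feature bits, as documented by Intel and AMD. */
#define CPUID1_EDX_MMX   (UINT32_C(1) << 23)
#define CPUID1_EDX_SSE   (UINT32_C(1) << 25)
#define CPUID1_EDX_SSE2  (UINT32_C(1) << 26)
#define CPUID1_ECX_SSE3  (UINT32_C(1) << 0)

/* Hypothetical labels for the copy routines a kernel might ship. */
enum copy_impl { COPY_GENERIC, COPY_MMX, COPY_SSE, COPY_SSE2, COPY_SSE3 };

/*
 * Pick the newest supported routine from the ECX and EDX values that
 * CPUID leaf 1 returned. Checks run from newest to oldest extension,
 * falling back to a generic routine when none is reported.
 */
enum copy_impl select_copy_impl(uint32_t ecx, uint32_t edx)
{
    if (ecx & CPUID1_ECX_SSE3)
        return COPY_SSE3;
    if (edx & CPUID1_EDX_SSE2)
        return COPY_SSE2;
    if (edx & CPUID1_EDX_SSE)
        return COPY_SSE;
    if (edx & CPUID1_EDX_MMX)
        return COPY_MMX;
    return COPY_GENERIC;
}
```

Doing the check once at boot and storing the result (for example in a function pointer) keeps the per-copy cost to a single indirect call, which is also how a prebuilt generic kernel can adapt to the CPU at runtime without being recompiled.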
-
Originally posted by dimko:
cat /proc/cpuinfo | grep sse3
It's sort of weird, but I don't seem to have SSE3 on my AMD quad core. However, I think the extension was there; it was just called something else for licensing reasons. I wonder what SSE4A is and whether it absorbs SSE3 into itself?
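A likely explanation for the empty grep: Linux lists SSE3 in /proc/cpuinfo under the name pni (Prescott New Instructions), not sse3, so on an AMD CPU the grep finds nothing, while on many Intel CPUs it only matches ssse3, which is a different extension. SSE4a is a separate AMD-specific extension and does not replace SSE3. Below is a small C sketch of a whole-token check on a flags line; the function names has_cpu_flag and flags_have_sse3 are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `flag` appears in `flags` as a whole token, i.e. bounded
 * by whitespace, a colon, or the start/end of the string. This avoids
 * false hits such as finding "sse3" inside "ssse3".
 */
bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        char end = p[len];
        bool ends = (end == '\0') || end == ' ' || end == '\t' || end == '\n';
        if (starts && ends)
            return true;
        p++; /* keep searching after this partial match */
    }
    return false;
}

/* Linux reports SSE3 as "pni" on the flags line of /proc/cpuinfo. */
bool flags_have_sse3(const char *flags)
{
    return has_cpu_flag(flags, "pni");
}
```

From the shell, the equivalent check is to search for the whole word pni on the flags line instead of sse3.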
-
to check it:
Originally posted by Dylar:
But if I use a prebuilt generic x86_64 kernel provided by my distro, is there a way the kernel could autodetect if my CPU has support for SSE3 at runtime, or do I have to recompile the kernel?