Per its 01.org project page, "NumaTOP is an observation tool for runtime memory locality characterization and analysis of processes and threads running on a NUMA system. It helps the user characterize the NUMA behavior of processes and threads and identify where the NUMA-related performance bottlenecks reside."
Besides this new user-space utility, those wishing to do this NUMA memory profiling also need to patch their kernel with a perf load latency patch. This perf load latency support is expected to be merged into a future Linux kernel release.
More details on NumaTOP's development and release from Intel can be found in this kernel mailing list announcement: "Performance analysis engineers know that NUMA can seriously impact performance and that NUMA performance analysis can be challenging. We've realized that currently there isn't an easy-to-use tool that lets us easily observe whether NUMA-related issues exist and, if so, where the NUMA bottleneck(s) reside. It can be quite challenging, especially in complex server environments."