Easy parsing of sysfs to get CPU topology data using libsystopo

20 01 2011

When you are working with main memory, it is crucial to make sure that all data structures are correctly sized and aligned. A typical approach is to create blocks of data that are processed independently. For the developer, the question is how large such blocks should be. The answer is that those blocks should be sized to fit the cache.
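To make the idea of independently processed blocks concrete, here is a minimal sketch. The helper name `forEachBlock` and the `cache_bytes` parameter are my own illustrative assumptions, not part of any library; the sketch only shows how a cache size in bytes could be turned into a block length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: walk over `data` in blocks that each fit into
// `cache_bytes` bytes and hand every block to `fn(pointer, element_count)`.
template <typename T, typename Fn>
void forEachBlock(std::vector<T>& data, std::size_t cache_bytes, Fn fn)
{
    // Elements per block; never less than one, even for a tiny cache value.
    const std::size_t per_block =
        std::max<std::size_t>(1, cache_bytes / sizeof(T));

    for (std::size_t begin = 0; begin < data.size(); begin += per_block) {
        // The last block may be shorter than the others.
        const std::size_t end = std::min(begin + per_block, data.size());
        fn(data.data() + begin, end - begin);
    }
}
```

With a 32 KiB budget and 4-byte elements, for example, each call to `fn` would see at most 8192 elements.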

Now, how large is the cache on the system you are using? You can either run experiments to detect the different cache levels and cache line sizes or, if you are lucky enough to have an Intel Linux system at hand, simply explore the information stored in the sysfs filesystem exported by the kernel.
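To give an impression of what exploring sysfs by hand involves, here is a rough sketch. On Linux the cache attributes live in files such as /sys/devices/system/cpu/cpu0/cache/index0/size, whose content looks like "32K". The function names below are hypothetical and are not taken from libsystopo; the 'M' suffix and the plain-number case are handled only defensively.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Build the path of one cache attribute, e.g. cpu 0, index 1, "size".
std::string cacheAttributePath(std::size_t cpu, std::size_t index,
                               const std::string& name)
{
    return "/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
           "/cache/index" + std::to_string(index) + "/" + name;
}

// Read the first line of an attribute file; returns "" if it cannot be read.
std::string readAttribute(const std::string& path)
{
    std::ifstream in(path);
    std::string line;
    if (in) std::getline(in, line);
    return line;
}

// Convert a size string such as "32K" into bytes; returns 0 for empty input.
std::size_t parseCacheSize(const std::string& text)
{
    std::size_t value = 0;
    std::size_t i = 0;
    // Accumulate the leading decimal digits.
    while (i < text.size() && text[i] >= '0' && text[i] <= '9') {
        value = value * 10 + static_cast<std::size_t>(text[i] - '0');
        ++i;
    }
    // Apply an optional unit suffix.
    if (i < text.size() && text[i] == 'K') value *= 1024;
    else if (i < text.size() && text[i] == 'M') value *= 1024 * 1024;
    return value;
}
```

A call like `parseCacheSize(readAttribute(cacheAttributePath(0, 0, "size")))` would then yield the size in bytes of the first cache listed for CPU 0, or 0 if the file does not exist.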

However, while parsing this information by hand at development time might be OK, at run time the best way is to adjust the system settings based on the actual configuration. At the moment, libudev does not support reading this information, so you are on your own.

To avoid everybody writing the same code, I took some time to write a small library that reads the information about the CPU caches and the CPU topology and lets you easily process this information in your program. You can find the most current version of the code, as usual, on GitHub. A minimal program looks like this:

    #include <systopo.h>
    using namespace systopo;

    int main(void)
    {
        System s = getSystemTopology();
        return 0;
    }

The System structure holds all the data parsed from sysfs. Its definition is shown here:

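    // Field names match the files under
    // /sys/devices/system/cpu/cpuN/cache/indexM/ in sysfs.
    struct Cache
    {
        size_t coherency_line_size;
        size_t level;
        size_t number_of_sets;
        size_t physical_line_partition;
        size_t size;
        size_t ways_of_associativity;

        std::string type;
    };

    // Field names match the files under
    // /sys/devices/system/cpu/cpuN/topology/ in sysfs.
    struct Topology
    {
        size_t core_id;
        size_t physical_package_id;
        std::vector<size_t> core_siblings;
        std::vector<size_t> thread_siblings;
    };

    // One entry per logical CPU: its caches and its place in the topology.
    struct CPU
    {
        std::vector<Cache> caches;
        Topology topology;
    };

    // The whole machine, including the online and offline CPU lists.
    struct System
    {
        std::vector<CPU> cpus;
        std::vector<size_t> online_cpus;
        std::vector<size_t> offline_cpus;
    };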
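As an illustration of how these structures could be used at run time, here is a small sketch that looks up the size of a data-holding cache at a given level. It uses trimmed stand-in structs in a `sketch` namespace so that it compiles on its own; with the real library you would include <systopo.h> and use its types. The helper `dataCacheSize` is hypothetical, and I am assuming that `type` carries the strings sysfs reports ("Data", "Instruction", "Unified"). The value is returned exactly as stored in `size`, so check the library for its unit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Trimmed stand-ins for the structures shown above (only the fields used here).
struct Cache {
    std::size_t coherency_line_size;
    std::size_t level;
    std::size_t size;
    std::string type;
};
struct CPU { std::vector<Cache> caches; };
struct System { std::vector<CPU> cpus; };

// Hypothetical helper: return `size` of the first cache of CPU `cpu` at
// `level` that can hold data (anything that is not an instruction cache).
// Returns 0 if the CPU index is out of range or no such cache is listed.
std::size_t dataCacheSize(const System& s, std::size_t cpu, std::size_t level)
{
    if (cpu >= s.cpus.size()) return 0;
    for (const Cache& c : s.cpus[cpu].caches) {
        if (c.level == level && c.type != "Instruction") return c.size;
    }
    return 0;
}

} // namespace sketch
```

The result of such a lookup could serve as the block size budget when partitioning data, instead of a value hard-coded at development time.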

For more information about the meaning of the topology fields, please refer to the kernel documentation. The meaning of the CPU cache fields should be clear; if not, refer to “What Every Programmer Should Know About Memory”.

If you have feedback, comments or ideas, I’m glad to respond!

– Martin