Tuesday, November 19, 2019

SPO600 Project Post 2

After the initial testing, I realized I was doing the benchmarking wrong. I had been running perf report on redis-cli (the tool used to interact with the Redis server) instead of on the actual server. Profiling the server properly displayed the following:


I ran a quick grep -rn 'redis-stable/' -e 'je_malloc' to find the code for the library function it was using:
JEMALLOC_EXPORT size_t JEMALLOC_NOTHROW
je_malloc_usable_size(JEMALLOC_USABLE_SIZE_CONST void *ptr) {
        size_t ret;
        tsdn_t *tsdn;

        LOG("core.malloc_usable_size.entry", "ptr: %p", ptr);

        assert(malloc_initialized() || IS_INITIALIZER);

        tsdn = tsdn_fetch();
        check_entry_exit_locking(tsdn);

        if (unlikely(ptr == NULL)) {
                ret = 0;
        } else {
                if (config_debug || force_ivsalloc) {
                        ret = ivsalloc(tsdn, ptr);
                        assert(force_ivsalloc || ret != 0);
                } else {
                        ret = isalloc(tsdn, ptr);
                }
        }
        check_entry_exit_locking(tsdn);
        LOG("core.malloc_usable_size.exit", "result: %zu", ret);
        return ret;
}
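
To get a feel for why a function like this would be called so often, here is a minimal sketch of an allocator wrapper that keeps a running total of used memory by asking the allocator for each block's usable size. As far as I can tell, Redis's own zmalloc wrapper does something along these lines when built with jemalloc, but this is not Redis code: tracked_malloc, tracked_free and tracked_used_memory are hypothetical names, and the sketch assumes Linux with glibc, where malloc_usable_size is declared in <malloc.h>.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size: glibc-specific, assumed here */
#include <stddef.h>
#include <stdlib.h>

/* Running total of bytes handed out, measured in usable (not requested) size. */
static size_t used_memory = 0;

/* Hypothetical wrapper: allocate, then record how much the allocator really gave us. */
void *tracked_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr == NULL) {
        return NULL;
    }
    used_memory += malloc_usable_size(ptr);
    return ptr;
}

/* Hypothetical wrapper: subtract the block's usable size before freeing it. */
void tracked_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    assert(used_memory >= malloc_usable_size(ptr));
    used_memory -= malloc_usable_size(ptr);
    free(ptr);
}

size_t tracked_used_memory(void) {
    return used_memory;
}
```

Every allocation and every free goes through the usable-size query, so a workload made of many small set operations would hit it constantly.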

At this point I was pretty lost, so I asked the professor what kind of optimization I could look at. I was given a few suggestions:
  • check whether Redis allocates new memory for each set operation
    • if it does, could we just overwrite the already allocated memory block for another set operation?
  • check whether there is a lot of deallocated memory; if there is, perhaps we can allocate different sizes of memory block for set operations
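
The first suggestion can be sketched roughly as follows. This is only an illustration of the idea, not how Redis stores values: value_buf and value_buf_set are hypothetical names, and the sketch assumes src is never NULL. A set reuses the existing block when the new value fits and only allocates when it does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value holder: data points to cap bytes, of which len are in use. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} value_buf;

/*
 * Store len bytes from src in v.
 * Returns 1 if the existing block was overwritten in place,
 *         0 if a new block had to be allocated,
 *        -1 if allocation failed (v is left unchanged).
 */
int value_buf_set(value_buf *v, const char *src, size_t len) {
    assert(v != NULL && src != NULL);

    if (v->data != NULL && len <= v->cap) {
        /* New value fits: overwrite the block we already own. */
        memcpy(v->data, src, len);
        v->len = len;
        return 1;
    }

    /* New value does not fit: allocate a fresh block and drop the old one. */
    size_t cap = len > 0 ? len : 1;
    char *fresh = malloc(cap);
    if (fresh == NULL) {
        return -1;
    }
    memcpy(fresh, src, len);
    free(v->data);
    v->data = fresh;
    v->len = len;
    v->cap = cap;
    return 0;
}
```

Starting from an empty value_buf, setting "hello world" allocates, and a following set of "hi" reuses the same block. The trade-off is that a large block keeps holding its memory while it stores a small value.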
The function I posted above was used a lot. Digging a bit deeper into the code that uses the jemalloc library, I discovered it was quite complicated. Redis allows users to configure threads for some operations, which can vastly speed them up.

In a perfect world, different threads would always touch different blocks of memory when writing data. However, the world is not perfect, and there is a chance that threads will contend for the same area of memory. To avoid this, jemalloc uses something called arenas: separate pools of memory that different threads are assigned to, so that they avoid stepping on each other's toes.
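
To make the idea concrete, here is a toy sketch in which every thread gets its own small pool to allocate from. This is a much cruder technique than jemalloc's arenas (it is a fixed-size per-thread bump allocator with no free at all), and toy_arena, toy_arena_alloc and toy_arena_used are hypothetical names, but it shows why separate pools mean threads never need to coordinate over the same bytes. It assumes a C11 compiler for _Thread_local and _Alignas.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_BYTES 4096
#define ARENA_ALIGN 16

/* A fixed pool of bytes plus a count of how many have been handed out. */
typedef struct {
    size_t used;
    _Alignas(ARENA_ALIGN) unsigned char bytes[ARENA_BYTES];
} toy_arena;

/* One arena per thread, so no locking is needed in this sketch. */
static _Thread_local toy_arena thread_arena;

/* Hand out the next chunk of the calling thread's arena, or NULL if it is full. */
void *toy_arena_alloc(size_t size) {
    /* Round up so every returned pointer stays ARENA_ALIGN-aligned. */
    size_t rounded = (size + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (rounded < size || rounded > ARENA_BYTES - thread_arena.used) {
        return NULL; /* overflow while rounding, or not enough room left */
    }
    void *ptr = thread_arena.bytes + thread_arena.used;
    thread_arena.used += rounded;
    return ptr;
}

/* Bytes consumed so far in the calling thread's arena. */
size_t toy_arena_used(void) {
    assert(thread_arena.used <= ARENA_BYTES);
    return thread_arena.used;
}
```

Two threads calling toy_arena_alloc at the same time each advance their own used counter, so neither can be handed memory that belongs to the other.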

Memory fragmentation is also an issue. Because Redis is used as a database, having data stored contiguously in memory makes get operations more efficient. jemalloc improves on the standard malloc by trying to minimize memory fragmentation.

