Exploring Persistent Data Structure in Persistent Memory (PMem) for a Lock-Free Multi-Thread Security
About the Author
Dr. Yang Jun has more than 10 years of rich research and practical experience in database and storage system; He is now working in the 4Paradigm (https://en.4paradigm.com/) as a system architect and is also a core member of a hierarchical storage technology community, MemArk (https://memark.io/en ). Recently, due to his outstanding contribution in the core community of the Persistent Memory Programming (https://pmem.io/), he has become China’s first community core contributor that has been adopted by the community (reviewer). This article will introduce the author’s latest code contribution, which has a far-reaching impact on the follow-up technology development of the community.
1. Background
Persistent Memory (PMem) can ensure that the data written to PMem will not be lost after power failure. However, the data written to PMem is often written to a volatile CPU cache first, and then brushed into PMem through a series of CPU instructions. Due to the hardware limitations of PMem and CPU, the atomicity of the write operation cannot be ensure when writing more than 8 bytes of data to PMem (When the power is lost during the process of the data writing, we cannot guarantee that the data is completely persistent on PMem). As a result, ensuring the data consistency of the persistent data structure in PMem is very challenging. If the persistent data structure are accessed by multi-threads, ensuring data consistency will become more complex. This paper mainly describes the consistency and visibility of persistent data under this background.
This article is suitable for developers who have a basic understanding of persistence programming. It mainly includes the following contents:
- This paper introduces the problems and solutions of data visibility and consistency in PMem lock-free programming.
- This paper introduces a PR that is recently incorporated into libpmemobj-cpp by the author of this paper, which is a built-in Atomic Persistent Pointer for the convenience of realizing Single-Writer-Multiple-Reader (SWMR), a multi-threaded persistent data structure. You can refer to the discussion and development process of this PR here.
2. The PMem Lock-Free Programming
Data visibility is one of the most important difficulties in PMem lock-free programming. For example, if a thread modifies an in-memory data using the STORE instruction, the new data may be read by another thread when it is not persisted to PMem (still in the CPU cache). Suppose the following two threads use atomic, which brings “atomic_write” and “atomic_ read” to write and read data:
//Thread 1
//Pmem - > a is initialized to 0atomic_ store(&pmem->a, 1); // Visibility: Yes, persistence: unknown
pmem_ persist(&pmem->a, sizeof(pmem->a)); // Visibility: Yes, persistence: Yes//Thread 2
//Pmem - > b initialized to 0if (atomic_load(&pmem->a) == 1) {
pmem - > b = 1; // visibility: Yes, persistence: unknown
pmem_persist (& pmem - > b, sizeof (pmem - > b)); // visibility: Yes, persistence: Yes
}
According to the different time that the program crash and suspension, we analyzed various possibilities of the values of a and b from the perspective of thread 2:
- If the program aborts before thread 1 starts, pmem -> a = 0, pmem -> b = 0
- If the program aborts after both threads are completed, pmem -> a = 1, pmem -> b = 1
- If the program aborts after thread 1 ends, pmem -> a = 1, pmem -> b = 0
However, as shown in Figure 1, if considering the data recoverability brought by the persistence feature of PMem, thread 1 has just executed atomic_store and not executed the pmem_persist while, thread 2 executes pmem_persist, after the system is powered down. Thus, pmem -> a is not persistent and pmem -> b is persistent, unexpected situations such as pmem -> a = 0 and pmem -> b = 1 will occur.
The main reason for this inconsistency between data and program logic is that the operation of thread 2 depends on non-persistent data. Therefore, one of the ways to avoid such data inconsistency is to perform an additional persistence operation immediately after reading the data. For example, thread 2 can be changed to:
//Thread 2
//pmem - > b initialized to 0if (atomic_load(&pmem->a) == 1) {
pmem_persist (& pmem - > a, sizeof (pmem - > a)); // visibility: Yes, persistence: Yes
pmem - > b = 1; // visibility: Yes, persistence: unknown
pmem_persist (& pmem - > b, sizeof (pmem - > b)); // visibility: Yes, persistence: Yes
}
This method can effectively avoid pmem → a = 0 and pmem → b = 1 (provided that there is only one write thread, readers can think about what problems this method will have if there are more than one write thread). But the cost is not small, because each read requires an additional persistence operation. In order to further optimize the performance, we propose a persistent smart pointer, which not only supports atomic operations, but also only performs necessary persistent operations according to the actual situation.
3. Persistent Pointers that Support Atomic Operations
The following is the specific implementation logic of intelligent pointer supporting atomic operation:
1. In order to support lock free programming, we must use 8 bytes as the size of the persistent pointer, but in pmdk and libpmemobj-cpp, the persistent_ptr available in the public data is a 16 byte “wide pointer”. After communicating with the Intel team, we learned a new persistence pointer that is still in the experimental testing stage: self_relative_ptr (self-offset pointer). This persistent pointer cleverly reduces the persistent pointer to 8 bytes by the offset (8 bytes) between the address of the saved data and the address of the pointer object itself in PMem, which brings the possibility of supporting atomic operation to the persistent pointer.
2. Based on the self_relative_ptr, we further find that the data addresses supporting atomic operations must be 8-byte aligned, so the lower 3 bits of the address are always 0. Thus, we reuse the lowest bit of the pointer address as dirty_flag to distinguish pointers that need to be persisted before reading. Finally, we implemented a persistent pointer that supports atomic operations: atomic_persistent_aware_ptr. The two most important APIs of this class are store / load, which is similar to the atomic standard class. It can read and write atomically the self_relative_ptr object where the specific process is as follows:
a. Store
- Modify the incoming self_relative_ptr address. Set the lowest bit to 1, indicating that the pointer is not persistent
- The modified self_relative_ptr is written by atom and saved to the internal atomic object
b. Load
- The internal atomic object is obtained through atomic reading, self_relative_ptr
- If this self_relative_ptr lowest value is 1, a persistent operation is performed, and CAS (compare and swap, an atomic operation supported by the standard atomic class, which is assigned when the condition is true) is used to clear the lowest PTR of self_relative_ptr in the internal atomic object
- Returns the self_relative_ptr after the lowest bit is cleared
3. It is not difficult to find that the above operations are persistent only when the lowest bit of self_relative_ptr is 1 during Load, and the lowest bit is cleared to 0 after persistence, which effectively avoids repeated persistence operations and ensures that the data is persistent when it is visible.
4. Readers may also notice that we have separated the persistence operation from the write operation (Store), which can be said to be an optimization for the scenario of writing more and reading less. Correspondingly, we have also implemented an optimization for the scenario of writing less and reading more. For details, please refer to the source code.
The above implementation logic has been integrated into libpmemobj-cpp. For the specific source code, please refer to: atomic_persistent_aware_ptr. On the basis of this persistent pointer, it will be very simple to correctly implement the above example without data inconsistency of a = 0 and b = 1:
//Using the transaction of libpmemobj-cpp to allocate PMem objects can ensure no external memory leakage
self_relative_ptr<int> one;
try {
pmem::obj::transaction::run(pop, [&] {
one = nvobj::make_persistent<int>(); //allocate an int in PMem
*one = 1; //Set to 1
pmem_persist(one, sizeof(int)); //Persist
});
} catch (...) {
ASSERT_UNREACHABLE;
}//Thread 1
atomic_ persistent_ aware_ ptr<int> a;
a.store(one); // Directly use the store function provided by atomic_persistent_aware_ptr//Thread 2
atomic_ persistent_ aware_ ptr<int> b;
If (* (a.load()) = = 1) { // directly use the Load function provided by atomic_persistent_aware_ptr
b.store(one);
}
4. Read More
· MemArk community official website: https://memark.io/en