Patterns in software development for sharing data between threads

If data is not shared in concurrent applications, data races cannot arise. No sharing means the thread is working with local variables. This can be accomplished by copying the data, using thread-local memory, or transferring a thread’s result to its associated future over a protected data channel.

Rainer Grimm has been working as a software architect, team leader and training manager for many years. He likes to write articles on the programming languages C++, Python and Haskell, but also likes to speak frequently at specialist conferences. On his blog Modernes C++ he deals intensively with his passion for C++.

The patterns in this section are fairly obvious, but I’ll introduce them with a brief explanation for completeness. Let’s start with Copied Value.

Copied Value

When a thread gets its arguments by copy rather than by reference, access to the data does not need to be synchronized. There are no data races and no data lifetime issues.

Data races with references

The following program creates three threads. The first thread gets its argument by copy, the second by reference, and the third by constant reference.

// copiedValueDataRace.cpp

#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

using namespace std::chrono_literals;

void byCopy(bool b){
    std::this_thread::sleep_for(1ms);             // (1)
    std::cout << "byCopy: " << b << '\n';
}

void byReference(bool& b){
    std::this_thread::sleep_for(1ms);            // (2)
    std::cout << "byReference: " << b << '\n';
}

void byConstReference(const bool& b){
    std::this_thread::sleep_for(1ms);            // (3)
    std::cout << "byConstReference: " << b << '\n';
}

int main(){

    std::cout << std::boolalpha << '\n';

    bool shared{false};
    
    std::thread t1(byCopy, shared);
    std::thread t2(byReference, std::ref(shared));
    std::thread t3(byConstReference, std::cref(shared));
    
    shared = true;
    
    t1.join();
    t2.join();
    t3.join();

    std::cout << '\n';

}

Each thread sleeps for one millisecond (1, 2, and 3) before displaying the boolean. Only thread t1 has a local copy of the boolean and therefore has no data race. The output of the program shows that the boolean values seen by threads t2 and t3 change without synchronization.

An obvious idea is to replace thread t3 in the previous example copiedValueDataRace.cpp with std::thread t3(byConstReference, shared). The program compiles and runs, but what looks like a reference is a copy. The reason is that the type trait std::decay is applied to each thread argument. std::decay performs the implicit lvalue-to-rvalue, array-to-pointer, and function-to-pointer conversions on its type T. In particular, in this case it applies std::remove_reference to the type T.
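The effect of std::decay can be checked at compile time. The following is a minimal sketch, not part of the original programs: the file name and the helper decaysToCopy are made up for illustration. It only shows how the type trait transforms types. It does not reproduce the internals of std::thread.

```cpp
// decayCheck.cpp (hypothetical file name)

#include <type_traits>

// std::decay strips the reference and the top-level const,
// so what is stored for the thread is a plain bool, that is, a copy.
static_assert(std::is_same<std::decay<const bool&>::type, bool>::value,
              "const bool& decays to bool");

// Arrays decay to pointers ...
static_assert(std::is_same<std::decay<int[3]>::type, int*>::value,
              "int[3] decays to int*");

// ... and functions decay to function pointers.
static_assert(std::is_same<std::decay<void(int)>::type, void(*)(int)>::value,
              "void(int) decays to void(*)(int)");

// Hypothetical helper so that the result can also be queried in code.
constexpr bool decaysToCopy(){
    return std::is_same<std::decay<const bool&>::type, bool>::value;
}
```

If the file compiles, all three assertions hold. A std::reference_wrapper, as produced by std::ref or std::cref, is itself copied, but it still refers to the original object.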

The following program perConstReference.cpp uses a non-copyable data type NonCopyableClass.

// perConstReference.cpp

#include <thread>

class NonCopyableClass{
    public:

    // the compiler generated default constructor
    NonCopyableClass() = default;

    // disallow copying
    NonCopyableClass& operator = 
      (const NonCopyableClass&) = delete;
    NonCopyableClass (const NonCopyableClass&) = delete;
  
};

void perConstReference(const NonCopyableClass& nonCopy){}

int main(){

    NonCopyableClass nonCopy;                      // (1)

    perConstReference(nonCopy);                    // (2)
    
    std::thread t(perConstReference, nonCopy);     // (3)
    t.join();

}

The object nonCopy (1) cannot be copied. That's fine when I call the function perConstReference with the argument nonCopy (2), because the function takes its argument by constant reference. If I use the same function in the thread t (3), GCC produces a verbose compiler error with more than 300 lines.

The essential part of the error message, highlighted in the original screenshot, is "error: use of deleted function": the copy constructor of the class NonCopyableClass is not available.
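One way to make the thread variant compile is to wrap the argument in std::cref, as thread t3 in copiedValueDataRace.cpp does. The following sketch assumes a slightly changed signature: the second parameter seen and the function threadSeesOriginal are hypothetical additions that only serve to check which object the thread received.

```cpp
// perConstReferenceCref.cpp (hypothetical file name)

#include <cassert>
#include <functional>
#include <thread>

class NonCopyableClass{
    public:
    NonCopyableClass() = default;
    NonCopyableClass& operator = (const NonCopyableClass&) = delete;
    NonCopyableClass (const NonCopyableClass&) = delete;
};

// Stores the address of the object the thread actually received.
void perConstReference(const NonCopyableClass& nonCopy,
                       const NonCopyableClass*& seen){
    seen = &nonCopy;
}

// Returns true if the thread worked on the original object.
bool threadSeesOriginal(){
    NonCopyableClass nonCopy;
    const NonCopyableClass* seen = nullptr;

    // std::cref creates a copyable std::reference_wrapper,
    // so NonCopyableClass itself never has to be copied.
    std::thread t(perConstReference, std::cref(nonCopy), std::ref(seen));
    t.join();                  // join before nonCopy goes out of scope

    assert(seen != nullptr);   // the thread has run
    return seen == &nonCopy;
}
```

The price of the reference is that the caller is now responsible for the lifetime of nonCopy, which is why the sketch joins the thread before the function returns.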

Anyone who borrows something must ensure that the underlying value is still available when it is used.

Reference lifetime issues

If a thread takes its argument by reference and you detach the thread, extreme caution is required. The small program copiedValueLifetimeIssues.cpp has undefined behavior.

// copiedValueLifetimeIssues.cpp

#include <chrono>
#include <iostream>
#include <string>
#include <thread>

void executeTwoThreads(){                                   // (1)
    
    const std::string localString("local string");          // (4)
    
    std::thread t1([localString]{
        std::cout << "Per Copy: " << localString << '\n';
    });
    
     std::thread t2([&localString]{
        std::cout << "Per Reference: " << localString << '\n';
    });
    
    t1.detach();                                           // (2)
    t2.detach();                                           // (3)
}
    
using namespace std::chrono_literals;

int main(){
    
    std::cout << '\n';
    
    executeTwoThreads();
    
    std::this_thread::sleep_for(1s);
    
    std::cout << '\n';
    
}

executeTwoThreads (1) starts two threads. Both threads are detached (2 and 3) and print the local variable localString (4). The first thread captures the local variable by copy and the second by reference. For simplicity, I used a lambda expression in both cases to bind the arguments. Because the function executeTwoThreads does not wait for the two threads to finish, thread t2 refers to a local string whose lifetime is bound to that of the calling function. This leads to undefined behavior. Oddly, with GCC, the maximally optimized executable (-O3) seems to work, while the non-optimized executable crashes.
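If the reference has to stay, one possible fix is to join the thread instead of detaching it, so that the local string outlives the thread. The following sketch is an assumption about how such a variant could look. The function name readByReferenceAndJoin and the result string are made up, and the thread writes into a string instead of std::cout so that the effect can be checked.

```cpp
// copiedValueLifetimeJoin.cpp (hypothetical file name)

#include <cassert>
#include <string>
#include <thread>

// The thread still captures by reference, but the function waits for it.
std::string readByReferenceAndJoin(){
    const std::string localString("local string");
    std::string result;

    std::thread t([&localString, &result]{
        result = "Per Reference: " + localString;
    });

    t.join();                  // localString and result are still alive here

    assert(!result.empty());   // the thread has written its result
    return result;
}
```

The join call is the only synchronization needed here: everything the thread did is visible to the caller once join returns.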

Thanks to thread-local storage, a thread can easily work on its data.

Thread local storage

Thread-local storage allows multiple threads to use local storage through a global access point. With the specifier thread_local, a variable becomes a thread-local variable. That means you can use the thread-local variable without synchronization. Suppose you want to calculate the sum of all elements of a vector randValues. This can be easily implemented with a range-based for loop.

unsigned long long sum{};
for (auto n: randValues) sum += n;

For a PC with four cores, you turn the sequential program into a concurrent program:

// threadLocallSummation.cpp

#include <atomic>
#include <functional>
#include <iostream>
#include <random>
#include <thread>
#include <vector>

constexpr long long size = 10000000;   

constexpr long long fir =  2500000;
constexpr long long sec =  5000000;
constexpr long long thi =  7500000;
constexpr long long fou = 10000000;

thread_local unsigned long long tmpSum = 0;

void sumUp(std::atomic<unsigned long long>& sum, 
           const std::vector<int>& val, 
           unsigned long long beg, unsigned long long end) {
    for (auto i = beg; i < end; ++i){
        tmpSum += val[i];
    }
    sum.fetch_add(tmpSum);                        // (1)
}

int main(){

  std::cout << '\n';

  std::vector<int> randValues;
  randValues.reserve(size);

  std::mt19937 engine;
  std::uniform_int_distribution<> uniformDist(1, 10);
  for (long long i = 0; i < size; ++i) 
      randValues.push_back(uniformDist(engine));
 
  std::atomic<unsigned long long> sum{}; 
  
  std::thread t1(sumUp, std::ref(sum), 
                 std::ref(randValues), 0, fir);
  std::thread t2(sumUp, std::ref(sum), 
                 std::ref(randValues), fir, sec);
  std::thread t3(sumUp, std::ref(sum), 
                 std::ref(randValues), sec, thi);
  std::thread t4(sumUp, std::ref(sum), 
                 std::ref(randValues), thi, fou);   
  
  t1.join();
  t2.join();
  t3.join();
  t4.join();

  std::cout << "Result: " << sum << '\n';

  std::cout << '\n';

}

You wrap the loop in a function and let each thread calculate a quarter of the sum in the thread_local variable tmpSum. The line sum.fetch_add(tmpSum) (1) finally adds each partial sum to the atomic sum. More about thread_local storage can be read in the article "Thread-local data".
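That every thread really gets its own instance of a thread_local variable can be shown with a small experiment. The following sketch is not taken from the summation program. The names counter, countInThread, and countersAreIndependent are hypothetical.

```cpp
// threadLocalCounter.cpp (hypothetical file name)

#include <cassert>
#include <thread>

thread_local int counter = 0;      // one instance per thread

// Increments the calling thread's own counter n times and returns it.
int countInThread(int n){
    for (int i = 0; i < n; ++i) ++counter;   // no synchronization needed
    return counter;
}

// Returns true if the calling thread's counter stayed untouched.
bool countersAreIndependent(){
    int first = 0;
    int second = 0;

    std::thread t1([&first]{ first = countInThread(3); });
    std::thread t2([&second]{ second = countInThread(5); });
    t1.join();
    t2.join();

    // Each thread counted only in its own instance.
    assert(first == 3 && second == 5);

    // This thread never called countInThread, so its instance is still 0.
    return counter == 0;
}
```

The same reasoning applies to tmpSum: each of the four threads accumulates into its own instance, and only the final fetch_add touches shared data.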

Promises and futures share a protected data channel.

Futures

C++11 offers futures and promises in three variants: std::async, std::packaged_task, and the pair std::promise and std::future. The future is a protected placeholder for the value that the promise sets. From a synchronization point of view, the crucial property of a promise/future pair is that a protected data channel connects the two. There are some decisions to be made when implementing a future.
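The protected data channel becomes visible when std::promise and std::future are used explicitly. The following is a minimal sketch under my own assumptions: the functions product and multiplyInThread are hypothetical and only illustrate the sending and the receiving end of the channel.

```cpp
// promiseFutureChannel.cpp (hypothetical file name)

#include <cassert>
#include <future>
#include <thread>
#include <utility>

// The promise is the sending end of the channel.
void product(std::promise<int> prom, int a, int b){
    prom.set_value(a * b);
}

// Computes a * b in a separate thread and fetches it via the future.
int multiplyInThread(int a, int b){
    std::promise<int> prom;
    std::future<int> fut = prom.get_future();   // the receiving end

    // A promise can only be moved, not copied.
    std::thread t(product, std::move(prom), a, b);

    assert(fut.valid());        // the future is connected to a shared state
    int result = fut.get();     // blocks until set_value has been called
    t.join();

    return result;
}
```

No mutex or atomic appears in the sketch. The synchronization between set_value and get is part of the channel itself.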

A future can explicitly ask for its value with the get call, and it can start the calculation lazily (only on request) or eagerly (immediately). Only the promise std::async supports lazy evaluation via a launch policy.

auto lazyOrEager = std::async([]{ return "LazyOrEager"; });
auto lazy = std::async(std::launch::deferred, 
  []{ return "Lazy"; });
auto eager = std::async(std::launch::async, []{ return "Eager"; });

lazyOrEager.get();
lazy.get();
eager.get();

If I don't specify a launch policy, the system decides whether to launch the job immediately or on demand. With the launch policy std::launch::async, a new thread is created and the promise starts working immediately. This is in contrast to the launch policy std::launch::deferred: only the call lazy.get() starts the promise. Also, the promise is then executed in the thread that requests the result with get.
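Where a job runs can be observed by returning std::this_thread::get_id() from it. The following sketch is my own illustration. The function names deferredRunsInCaller and asyncRunsElsewhere are hypothetical.

```cpp
// launchPolicyThreadId.cpp (hypothetical file name)

#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Returns true if the deferred job ran in the thread that called get.
bool deferredRunsInCaller(){
    auto lazy = std::async(std::launch::deferred,
                           []{ return std::this_thread::get_id(); });

    // Before get is called, the job has not been started.
    assert(lazy.wait_for(std::chrono::seconds(0))
           == std::future_status::deferred);

    return lazy.get() == std::this_thread::get_id();
}

// Returns true if the job started with std::launch::async
// ran in a thread other than the calling one.
bool asyncRunsElsewhere(){
    auto eager = std::async(std::launch::async,
                            []{ return std::this_thread::get_id(); });

    return eager.get() != std::this_thread::get_id();
}
```

With the default policy, either result is possible, which is why the sketch only covers the two explicit policies.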

More about futures in C++ can be found in the article "Asynchronous Function Calls".

What's next?

Data races cannot happen if data is not written and read at the same time. In my next article, I will write about patterns that help you protect data against modification.

(rme)
