
Understanding Ruby Threads and Concurrency

Stanley Ulili
Updated on September 10, 2025

Ruby's concurrency model is built around threads, but their execution is managed by a mechanism known as the Global Interpreter Lock (GIL). The GIL ensures that only one thread can run Ruby code at any given moment. This design choice simplifies thread safety within the Ruby interpreter itself but has important implications for how concurrent programs perform.

This structure creates a clear distinction between two types of tasks. For CPU-bound operations that require constant computation, Ruby threads cannot run in true parallelism across multiple processor cores. However, the model is highly effective for I/O-bound work, where a program spends time waiting for external resources like a database or a network API.

When a thread is blocked waiting for an I/O operation to complete, it releases the GIL, allowing another thread to run. This efficient task-switching makes Ruby's concurrency model well-suited for typical web applications, API clients, and other programs where the primary bottleneck is waiting for input or output.

In this article, we'll explore:

  • How Ruby's GIL creates both constraints and opportunities for concurrent programming
  • Why I/O-bound tasks see dramatic performance improvements with threading
  • When to choose processes or Ractors for CPU-intensive work
  • Thread safety patterns that prevent data corruption in concurrent code

Let's dive in!

Prerequisites

You'll need Ruby 2.7 or later installed. The threading model received significant improvements in Ruby 3.0, so newer versions will show better performance characteristics:

 
ruby --version
Output
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +PRISM [arm64-darwin24]

Some examples work better with multiple CPU cores, though Ruby's GIL means the benefits vary by workload type:

 
ruby -e "require 'etc'; puts Etc.nprocessors"
 
4

Setting up the demo project

To effectively demonstrate Ruby's concurrency concepts, you'll build a practical example that showcases the difference between sequential and concurrent execution. This demo will simulate real-world scenarios where your application needs to handle multiple time-consuming operations.

The key insight we'll explore is how Ruby's threading model behaves differently depending on the type of work being performed. By starting with a controlled example using simulated I/O operations, you'll see exactly how threading can transform application performance.

Create your project directory:

 
mkdir ruby-concurrency-demo && cd ruby-concurrency-demo

Create a simple baseline that demonstrates typical application bottlenecks:

baseline.rb
require 'benchmark'
require 'net/http'

def slow_task(name, duration)
  puts "Starting #{name}"
  sleep(duration)
  puts "Finished #{name}"
  name
end

# Sequential baseline
puts "Sequential execution:"
time = Benchmark.realtime do
  slow_task("Task 1", 1)
  slow_task("Task 2", 1) 
  slow_task("Task 3", 1)
end

puts "Total time: #{time.round(2)} seconds"

This baseline simulates a common application pattern where multiple operations must complete before the program can continue. Each slow_task represents operations like API calls, database queries, or file processing that involve waiting for external resources.

The sleep() function is particularly important for this demonstration because it triggers Ruby's GIL release mechanism. When Ruby encounters system calls like sleep, file operations, or network requests, it temporarily releases the Global Interpreter Lock, allowing other threads to execute. This behavior makes I/O-bound operations ideal candidates for threading improvements.

The Benchmark.realtime method measures wall-clock time, giving us the actual duration users would experience. This differs from CPU time measurements and reflects the real-world impact of concurrent execution.

Run this to establish your baseline:

 
ruby baseline.rb
Output
Sequential execution:
Starting Task 1
Finished Task 1
Starting Task 2
Finished Task 2
Starting Task 3
Finished Task 3
Total time: 3.01 seconds

Notice how each task waits for the previous one to complete before starting. This creates a linear execution pattern where the total time equals the sum of all individual task durations. In real applications, this pattern leads to poor user experience as operations that could happen simultaneously are forced to wait in sequence.

This baseline establishes the performance ceiling we'll break through with threading, demonstrating how Ruby's concurrency model can dramatically improve application responsiveness for I/O-bound workloads.

Implementing threading for performance gains

Now that you've established a baseline with sequential execution, you'll modify the code to use Ruby threads and observe the dramatic performance improvement for I/O-bound operations. This transformation demonstrates how Ruby's GIL release mechanism during system calls enables effective concurrent execution.

The threading approach creates separate execution contexts for each task, allowing them to run simultaneously rather than waiting in sequence. While the GIL prevents true parallel execution of Ruby code, the sleep() calls trigger GIL release, enabling other threads to progress while one thread waits.

Modify your baseline file to add the threaded implementation:

baseline.rb
require 'benchmark'
require 'net/http'

def slow_task(name, duration)
  ...
end

# Sequential baseline
puts "Sequential execution:"
time = Benchmark.realtime do
  ...
end

puts "Total time: #{time.round(2)} seconds"

# Threaded execution
puts "\nThreaded execution:"
threaded_time = Benchmark.realtime do
  threads = []
  threads << Thread.new { slow_task("Task 1", 1) }
  threads << Thread.new { slow_task("Task 2", 1) }
  threads << Thread.new { slow_task("Task 3", 1) }
  threads.each(&:join)
end
puts "Total time: #{threaded_time.round(2)} seconds"
puts "Performance improvement: #{(time / threaded_time).round(2)}x faster"
puts "Time saved: #{((time - threaded_time) / time * 100).round(1)}% reduction"

The Thread.new constructor creates a new operating system thread that immediately begins executing the provided block. Each thread operates independently, managed by Ruby's internal thread scheduler in cooperation with the operating system's thread scheduler.

The join method serves as a synchronization barrier, blocking the main thread until each worker thread completes. This ensures accurate timing measurements and prevents the program from terminating before all work finishes.
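join also accepts an optional timeout in seconds, returning nil if the thread is still running when the limit expires. A minimal standalone sketch, separate from the demo file (the durations are illustrative):

worker = Thread.new { sleep(5); "done" }

# join(1) waits at most 1 second; it returns the thread if it
# finished in time, or nil if the timeout expired.
if worker.join(1).nil?
  puts "Worker still running after 1 second"
end
# (at program exit, Ruby kills any threads still running)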

Run the enhanced version to see the performance transformation:

 
ruby baseline.rb
Output
Sequential execution:
Starting Task 1
Finished Task 1
Starting Task 2
Finished Task 2
Starting Task 3
Finished Task 3
Total time: 3.02 seconds

Threaded execution:
Starting Task 1
Starting Task 2
Starting Task 3
Finished Task 1
Finished Task 2
Finished Task 3
Total time: 1.0 seconds
Performance improvement: 3.01x faster
Time saved: 66.8% reduction

The output shows clear differences between execution patterns. In sequential execution, each task waits for the previous one to complete, creating a predictable chain where "Starting Task 2" appears only after "Finished Task 1".

The threaded section demonstrates concurrent execution with all three "Starting" messages appearing simultaneously, showing that Ruby launched all threads at once rather than waiting for completion. The variable finish order reflects normal thread scheduling differences.

The 3.01x performance improvement means the same work completed in one-third the time. This approaches the theoretical maximum since tasks now run simultaneously rather than sequentially - total time equals the longest individual task rather than the sum of all tasks.

This effectiveness stems from Ruby's GIL behavior during I/O operations. When sleep() executes, Ruby releases the GIL for the duration of the underlying system call, allowing other threads to acquire it and run in the meantime.
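The flip side is easy to demonstrate: replace the sleep with pure computation and the speedup disappears, because the GIL stays held while Ruby code runs. A minimal sketch you can try alongside the demo (exact timings vary by machine):

require 'benchmark'

# Pure Ruby computation never blocks on I/O, so the GIL is never released.
def cpu_task
  1_000_000.times.reduce(0) { |sum, i| sum + i * i }
end

sequential = Benchmark.realtime { 3.times { cpu_task } }

threaded = Benchmark.realtime do
  3.times.map { Thread.new { cpu_task } }.each(&:join)
end

puts "Sequential: #{sequential.round(2)}s"
puts "Threaded:   #{threaded.round(2)}s" # roughly the same, sometimes slower

Threading adds scheduling overhead here without adding parallelism, which is why CPU-intensive work belongs in processes or Ractors instead.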

Working with thread return values and communication

While the previous example demonstrated performance improvements, real applications need to collect results from threaded operations and coordinate data flow between concurrent threads. Ruby's threading model provides communication mechanisms that enable this coordination while preserving the performance benefits of concurrent execution.

The fundamental challenge in concurrent programming lies in gathering results from multiple independent execution contexts. Unlike sequential code where each operation naturally flows into the next, threaded operations complete asynchronously and potentially out of order. Ruby addresses this through the value method, which provides a synchronization point that blocks until a specific thread completes and returns its result.

This communication pattern becomes crucial in data processing pipelines, API aggregation services, and parallel computation scenarios where the main thread must wait for all worker threads to complete before proceeding. The value method ensures that results are available when needed while preserving the performance benefits of concurrent execution during the actual work phase.

Create a new file to demonstrate thread communication patterns:

thread_communication.rb
require 'benchmark'

def process_item(item)
  puts "Processing #{item}..."
  sleep(0.5)
  puts "Completed #{item}"
  item.upcase
end

# Sequential processing
items = ['apple', 'banana', 'cherry']
puts "Sequential processing:"

sequential_time = Benchmark.realtime do
  @sequential_results = items.map { |item| process_item(item) }
end

puts "Results: #{@sequential_results}"
puts "Time: #{sequential_time.round(2)} seconds"

This baseline establishes a typical data transformation scenario where each item requires processing before the program can continue. The process_item method simulates realistic operations like data validation, API calls, or file transformations that involve both computation and I/O waiting.

The sequential approach processes items one at a time, with each operation blocking until completion. This creates a predictable execution flow but sacrifices potential parallelism when operations could run concurrently.

Run this to establish the baseline:

 
ruby thread_communication.rb
Output
Sequential processing:
Processing apple...
Completed apple
Processing banana...
Completed banana
Processing cherry...
Completed cherry
Results: ["APPLE", "BANANA", "CHERRY"]
Time: 1.51 seconds

The sequential output demonstrates the familiar linear progression where each item waits for the previous one to complete. The total processing time equals the cumulative duration of all individual operations, creating a clear opportunity for threading optimization.

Now add the threaded version that demonstrates result collection from concurrent operations:

thread_communication.rb
# Previous code...

# Threaded processing with result collection
puts "\nThreaded processing:"
threaded_time = Benchmark.realtime do
  threads = items.map do |item|
    Thread.new(item) { |i| process_item(i) }
  end
  @threaded_results = threads.map(&:value)
end
puts "Results: #{@threaded_results}"
puts "Time: #{threaded_time.round(2)} seconds"
puts "Improvement: #{(sequential_time / threaded_time).round(2)}x faster"

The threading implementation introduces two critical phases: thread creation and result collection. During the first phase, items.map creates a thread for each item, passing the item as a parameter to the thread block. This parameter passing mechanism ensures clean data flow from the parent context to each worker thread without shared mutable state.

The second phase uses threads.map(&:value) to collect results from all threads. This operation is crucial because it maintains result ordering based on thread creation sequence, not completion order. Each call to value blocks until the corresponding thread completes, ensuring all results are available before proceeding.
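One detail worth knowing before running it: value also re-raises any exception the thread terminated with, so failures surface at collection time instead of disappearing silently. A minimal standalone sketch (the item names are illustrative):

threads = ['good', 'bad'].map do |item|
  Thread.new(item) do |i|
    raise "failed on #{i}" if i == 'bad'
    i.upcase
  end
end

results = threads.map do |t|
  begin
    t.value
  rescue => e
    "error: #{e.message}"
  end
end

puts results.inspect # => ["GOOD", "error: failed on bad"]

Note that Ruby also prints a warning to stderr when a thread dies with an exception, since Thread.report_on_exception defaults to true as of Ruby 2.5.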

Run the complete version to observe the threading communication pattern:

 
ruby thread_communication.rb
Output
Sequential processing:
Processing apple...
Completed apple
Processing banana...
Completed banana
Processing cherry...
Completed cherry
Results: ["APPLE", "BANANA", "CHERRY"]
Time: 1.51 seconds

Threaded processing:
Processing apple...
Processing banana...
Processing cherry...
Completed apple
Completed cherry
Completed banana
Results: ["APPLE", "BANANA", "CHERRY"]
Time: 0.5 seconds
Improvement: 3.0x faster

The threaded output shows all "Processing" messages appearing simultaneously, confirming concurrent thread startup. Completion messages may vary in order due to normal thread scheduling differences, but the final results maintain perfect ordering because threads.map(&:value) collects results in thread creation order, not completion order.

The 3x performance improvement occurs because processing happens simultaneously rather than sequentially. Total time equals the longest operation duration plus minimal thread overhead, approaching 0.5 seconds instead of the cumulative 1.5 seconds.

This pattern scales well: processing 100 items that each spend 0.5 seconds waiting on I/O would take 50 seconds sequentially but only about 0.5 seconds with threading, assuming the system can run that many threads concurrently. In practice you'll usually cap concurrency with a small worker pool rather than spawning one thread per item, as sketched below. Either way, Ruby preserves result ordering, which makes the pattern suitable for production systems that require predictable output sequences.
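A minimal worker-pool sketch, assuming you want bounded concurrency (the pool size and sleep duration are illustrative; Queue is thread-safe out of the box):

items = (1..100).map { |i| "item-#{i}" }

queue = Queue.new
items.each { |item| queue << item }
queue.close # once closed and drained, pop returns nil

results = Queue.new
workers = 10.times.map do
  Thread.new do
    while (item = queue.pop) # nil ends the loop when the queue is empty
      sleep(0.05)            # stand-in for I/O-bound work
      results << item.upcase
    end
  end
end

workers.each(&:join)
puts results.size # => 100

With 10 workers, 100 items at 0.05 seconds each finish in roughly 0.5 seconds while never running more than 10 threads at once.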

Thread safety and synchronization mechanisms

When multiple threads access shared data simultaneously, race conditions can occur where the final result depends on unpredictable thread timing. These issues don't appear in the previous examples because each thread worked with independent data, but real applications often require threads to coordinate access to shared resources like counters, caches, or data structures.

Ruby provides several synchronization primitives to ensure thread-safe operations. The most fundamental is the Mutex (mutual exclusion), which guarantees that only one thread can execute a critical section of code at any given time. Understanding these mechanisms is essential for building reliable concurrent applications that maintain data integrity under concurrent access.

Race conditions occur because operations that appear atomic in Ruby code often consist of multiple steps at the machine level. A simple increment operation like counter += 1 actually involves reading the current value, adding one, and storing the result. When multiple threads perform these steps simultaneously, they can interfere with each other, leading to lost updates and inconsistent results.

Create a new file to demonstrate thread safety issues and solutions:

thread_safety.rb
# Demonstrate race condition with unsafe counter
class UnsafeCounter
  def initialize
    @count = 0
  end

  def increment
    current = @count
    # Simulate some processing time that makes race condition more likely
    sleep(0.001)
    @count = current + 1
  end

  def value
    @count
  end
end

puts "Testing unsafe counter with concurrent access:"
unsafe_counter = UnsafeCounter.new

threads = 10.times.map do
  Thread.new do
    100.times { unsafe_counter.increment }
  end
end

threads.each(&:join)
puts "Expected result: 1000"
puts "Actual result: #{unsafe_counter.value}"
puts "Lost updates: #{1000 - unsafe_counter.value}"

This example demonstrates a classic race condition scenario. Ten threads each increment a counter 100 times, so the expected result should be 1000. However, the sleep(0.001) call between reading and writing the counter value creates a window where race conditions can occur.

Run this to observe the race condition:

 
ruby thread_safety.rb
Output
Testing unsafe counter with concurrent access:
Expected result: 1000
Actual result: 100
Lost updates: 900

The actual result varies between runs but is consistently less than 1000, demonstrating lost updates caused by race conditions. Multiple threads read the same initial value, increment it, and write back the same result, effectively losing some increments.
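To see why updates disappear, consider one possible interleaving of two threads calling increment, sketched as comments (the values are illustrative):

# Thread A: current = @count      # reads 0
# Thread B: current = @count      # also reads 0
# Thread A: @count = current + 1  # writes 1
# Thread B: @count = current + 1  # writes 1 again; A's update is lost

Two increments ran, but the counter only advanced by one. The sleep in the demo widens this window, but the same interleaving can happen without it.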

Now add the thread-safe solution using a Mutex:

thread_safety.rb
# Previous code...

# Thread-safe counter with Mutex
class SafeCounter
  def initialize
    @count = 0
    @mutex = Mutex.new
  end

  def increment
    @mutex.synchronize do
      current = @count
      sleep(0.001) # Same processing delay
      @count = current + 1
    end
  end

  def value
    @mutex.synchronize { @count }
  end
end

puts "\nTesting safe counter with concurrent access:"
safe_counter = SafeCounter.new

threads = 10.times.map do
  Thread.new do
    100.times { safe_counter.increment }
  end
end

threads.each(&:join)
puts "Expected result: 1000"
puts "Actual result: #{safe_counter.value}"
puts "Lost updates: #{1000 - safe_counter.value}"

The Mutex#synchronize method ensures that only one thread can execute the critical section at a time. When a thread enters the synchronized block, it acquires an exclusive lock. Other threads attempting to enter must wait until the lock is released, preventing race conditions.
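synchronize is effectively shorthand for acquiring the lock and releasing it in an ensure block. If you ever need finer-grained control, the manual form looks like this:

mutex = Mutex.new

mutex.lock
begin
  # critical section goes here
ensure
  mutex.unlock # released even if the critical section raises
end

The block form is preferred in practice because it makes forgetting the unlock impossible.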

Run the complete example to see the difference:

 
ruby thread_safety.rb
Output
Testing unsafe counter with concurrent access:
Expected result: 1000
Actual result: 100
Lost updates: 900

Testing safe counter with concurrent access:
Expected result: 1000
Actual result: 1000
Lost updates: 0

The safe counter reliably gives the correct result because the mutex prevents multiple threads from accessing the critical section at the same time. Each increment operation completes atomically from the perspective of other threads, eliminating race conditions.

The performance trade-off is clear: the synchronized version runs slower because threads must wait for exclusive access instead of executing simultaneously. However, this serialization is essential to keep data consistent. The key is to determine which operations need synchronization and to limit the scope of synchronized blocks to balance safety and performance.

This synchronization pattern applies not only to simple counters but also to any shared mutable state. Database connection pools, cache implementations, and shared data structures all need similar protection to operate correctly in multi-threaded environments.
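For example, a minimal memoizing cache might look like the sketch below (an illustration, not production code; note that holding the lock while the block runs serializes cache misses, which is the safety-versus-performance trade-off discussed above):

class ThreadSafeCache
  def initialize
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value for key, computing and storing it on first access.
  def fetch(key)
    @mutex.synchronize do
      @store.fetch(key) { @store[key] = yield }
    end
  end
end

cache = ThreadSafeCache.new
threads = 5.times.map do
  Thread.new { cache.fetch(:config) { sleep(0.1); "loaded" } }
end
threads.each(&:join) # the block runs once; the other four threads reuse the result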

Final thoughts

Ruby's threading model prioritizes developer productivity over raw performance. The GIL constrains CPU-bound work, but threads excel at I/O-bound operations; in the examples above they delivered roughly 3x speedups by overlapping waits.

Use threads for I/O-bound tasks like database queries and API calls. Apply synchronization with Mutex when accessing shared state. Consider processes or Ractors for CPU-intensive work since the GIL prevents computational parallelism.
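If you need computational parallelism within a single Ruby process, Ractors are the built-in option, though they remain experimental and their API has shifted between Ruby versions. A minimal sketch (Ruby 3.x prints an experimental warning on first use):

def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Each Ractor has its own execution context, so CPU-bound work
# can run on multiple cores in parallel.
ractors = 4.times.map { Ractor.new { fib(25) } }
puts ractors.map(&:take).inspect # take blocks until each Ractor finishes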

Ruby's concurrency focuses on practical I/O patterns while maintaining code maintainability. Explore the Ruby Thread documentation to deepen your understanding and make informed decisions about introducing concurrency into your applications.
