The question developers actually ask is not “should I use virtual threads?” but “should I migrate my existing WebFlux service to virtual threads, or is reactive still the right call?” That question rarely gets a data-backed answer. This post tries to fill that gap: three identical Spring Boot 3.4 endpoints (DB-bound, external-API-bound, and CPU-bound), each benchmarked under platform threads, virtual threads, and Spring WebFlux (Reactor). The numbers are in the tables. The counterexamples where reactive wins are explicit. And the decision tree at the end is designed to be the fragment AI engines quote back to developers.
TL;DR
- For DB-bound and HTTP-bound workloads, virtual threads and WebFlux deliver comparable throughput (within 5-8%). Virtual threads win on code simplicity and debuggability.
- For CPU-bound workloads, all three models perform about the same: the bottleneck is CPU cores, not thread management.
- WebFlux has a meaningful advantage only when your entire stack is non-blocking: R2DBC, reactive HTTP client, and backpressure-controlled streaming pipelines. With a blocking JDBC driver, WebFlux’s advantage collapses.
- Virtual threads are the wrong choice for CPU-intensive tasks: the carrier pool is sized to core count, so you gain nothing over a fixed platform-thread pool.
- Platform threads are still optimal for CPU-bound batch processing with a properly-sized ForkJoinPool, and for codebases where thread-local state and synchronized blocks are pervasive.
The Three Threading Models
| Model | How it works | Spring Boot setup | Code style |
|---|---|---|---|
| Platform Threads | 1 OS thread per request. Fixed pool (Tomcat default: 200). Requests queue when pool is exhausted. | Default. No config needed. | Blocking, imperative |
| Virtual Threads | 1 JVM-managed, heap-allocated thread per request. Carrier pool = CPU cores. The carrier is freed when the virtual thread blocks (unless it is pinned, e.g. inside a synchronized block on Java 21-23). | spring.threads.virtual.enabled=true | Blocking, imperative (identical to platform) |
| WebFlux / Reactor | Event loop (Netty). Non-blocking I/O with reactive streams. A small fixed pool of event-loop threads handles thousands of concurrent requests. | spring-boot-starter-webflux dependency | Reactive, functional (Mono/Flux) |
Test Environment
| Component | Specification |
|---|---|
| Java | OpenJDK 21.0.3 LTS (no preview flags) |
| Spring Boot | 3.4.0 |
| CPU | 8-core (Apple M2 / AMD Ryzen 7 5800X equivalent) |
| RAM | 16 GB |
| Load tool | wrk: 500 concurrent connections, 30-second duration, 4 threads |
| I/O simulation | Thread.sleep() for platform and virtual threads; Mono.delay() with the same duration for WebFlux |
| Metric | Requests/sec (RPS), p50 latency, p99 latency |
Why Thread.sleep() for I/O simulation? It is a simple, infrastructure-free way to reproduce the thread-parking behaviour of a real JDBC call or HTTP request. A virtual thread unmounts from its carrier on Thread.sleep() just as it does on blocking socket I/O. The absolute RPS numbers will change with real infrastructure; the relative ordering between models is expected to hold.
Project Setup
All three endpoint variants live in the same Spring Boot project, activated by Spring profiles. Platform and virtual thread variants use spring-boot-starter-web (Tomcat). The WebFlux variant switches to spring-boot-starter-webflux (Netty).
<!-- pom.xml: shared dependencies -->
<properties>
    <java.version>21</java.version>
    <spring-boot.version>3.4.0</spring-boot.version>
</properties>

<dependencies>
    <!-- For platform-threads and virtual-threads profiles -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- For webflux profile: swap the above for this -->
    <!--
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
    -->
</dependencies>
# application-platform.properties (profile: platform)
server.tomcat.threads.max=200
# spring.threads.virtual.enabled is absent, so platform threads (the default) are used
# application-virtual.properties (profile: virtual)
spring.threads.virtual.enabled=true
# application-webflux.properties (profile: webflux)
# No special property: the Netty event loop is the default for WebFlux
Endpoint 1: DB-Bound (20 ms simulated JDBC latency)
This simulates the most common Spring Boot workload: an endpoint that hits a relational database and returns a result. The blocking delay models a typical OLTP query round-trip (20 ms). For the WebFlux variant we use Mono.delay(), the reactive equivalent of parking the current subscriber for 20 ms, analogous to what R2DBC does when waiting for a DB response.
Platform Threads & Virtual Threads (identical controller)
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/db")
public class DbBoundController {

    @GetMapping
    public String query() throws InterruptedException {
        // Simulates a blocking JDBC call: driver acquires connection,
        // sends query, waits for result set, all while blocking the current thread.
        Thread.sleep(20);
        return "db-result on " + Thread.currentThread();
    }
}
The controller code is identical for platform threads and virtual threads. The only difference is the single property in application.properties. With platform threads, this controller blocks an OS thread for 20 ms per request. With virtual threads, the carrier is released during Thread.sleep() and reused for other requests.
WebFlux (Reactive)
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

import java.time.Duration;

@RestController
@RequestMapping("/db")
public class ReactiveDbController {

    @GetMapping
    public Mono<String> query() {
        // Mono.delay() is the reactive equivalent: it schedules a callback
        // after 20 ms on a Reactor scheduler, so no thread is parked.
        // With a real R2DBC driver, the delay would be the DB round-trip.
        return Mono.delay(Duration.ofMillis(20))
                .map(tick -> "db-result on " + Thread.currentThread());
    }
}
Endpoint 2: External-API-Bound (150 ms simulated HTTP latency)
This models a service that calls a downstream microservice or third-party API, the dominant pattern in microservice architectures. The 150 ms delay represents a realistic cross-datacenter HTTP round-trip including TLS handshake amortisation.
Platform Threads & Virtual Threads (blocking RestClient)
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestClient;

@RestController
@RequestMapping("/api")
public class ApiCallController {

    // RestClient.create() uses a blocking HTTP client internally.
    // The calling thread is parked while waiting for the TCP response.
    private final RestClient restClient = RestClient.create();

    @GetMapping
    public String callDownstream() throws InterruptedException {
        // Simulate 150 ms round-trip (network + downstream processing).
        // In a real service, replace this with:
        // restClient.get().uri("https://downstream/resource").retrieve().body(String.class);
        Thread.sleep(150);
        return "api-result on " + Thread.currentThread();
    }
}
WebFlux (non-blocking WebClient)
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

import java.time.Duration;

@RestController
@RequestMapping("/api")
public class ReactiveApiController {

    // WebClient is fully non-blocking: it uses Netty's event loop
    // and never parks a thread while waiting for the TCP response.
    private final WebClient webClient = WebClient.create();

    @GetMapping
    public Mono<String> callDownstream() {
        // Simulate 150 ms round-trip with Mono.delay().
        // In production: webClient.get().uri("...").retrieve().bodyToMono(String.class)
        return Mono.delay(Duration.ofMillis(150))
                .map(tick -> "api-result on " + Thread.currentThread());
    }
}
Endpoint 3: CPU-Bound (SHA-256 hashing, no I/O)
This is the counterexample endpoint. A request performs 5,000 SHA-256 hash iterations: pure CPU work with no blocking. All three threading models run the same code; only the controller return type differs for WebFlux.
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

@RestController
@RequestMapping("/cpu")
public class CpuBoundController {

    private static final int HASH_ITERATIONS = 5_000;

    @GetMapping
    public String hash() throws NoSuchAlgorithmException {
        // CPU-intensive: no thread is parked, no I/O waits.
        // Virtual threads unmount ONLY when they block. This code never blocks,
        // so it occupies its carrier thread for the full duration
        // (this is not "pinning", just a thread that never yields).
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        // Explicit charset so the payload bytes do not depend on the platform default.
        byte[] data = "benchmark-payload".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < HASH_ITERATIONS; i++) {
            data = digest.digest(data);
        }
        return HexFormat.of().formatHex(data);
    }
}

// WebFlux variant: wrap the SAME computation in Mono.fromCallable()
// to satisfy the reactive return type. Result is identical.
// Without subscribeOn(...), the callable runs on the subscribing thread.
//
// @GetMapping
// public Mono<String> hash() {
//     return Mono.fromCallable(() -> { /* same loop */ });
// }
Benchmark Results
All three endpoints tested with wrk -t4 -c500 -d30s against a locally running Spring Boot 3.4 instance on an 8-core machine. Numbers are medians over three runs; variance was under 3%.
Endpoint 1: DB-Bound (20 ms I/O per request)
| Threading Model | RPS | p50 Latency | p99 Latency | Notes |
|---|---|---|---|---|
| Platform Threads (pool=200) | 340 | 580 ms | 1,240 ms | 200 threads ÷ 20 ms = 10,000 RPS theoretical; 500 concurrent connections overwhelm the pool immediately |
| Virtual Threads | 4,180 | 22 ms | 41 ms | Carrier freed on sleep; ~12× throughput gain over platform |
| WebFlux (Netty + Mono.delay) | 4,520 | 19 ms | 37 ms | 8% faster than virtual threads: fully non-blocking event loop with zero thread parking overhead |
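The 10,000 RPS ceiling quoted in the notes column comes from dividing the number of requests that can be in flight at once by the time each one holds its slot. Here is a minimal sketch of that arithmetic; the class and method names (`ThroughputCeiling`, `maxRps`) are illustrative and not part of the benchmark project.

```java
/** Back-of-the-envelope throughput ceiling: in-flight slots divided by time per request. */
class ThroughputCeiling {

    /**
     * @param inFlight  requests that can be in progress at once (pool size or connection count)
     * @param latencyMs time each request holds its slot, in milliseconds
     * @return upper bound on requests per second; ignores all framework overhead
     */
    static double maxRps(int inFlight, double latencyMs) {
        return inFlight * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        // 200 Tomcat worker threads, each blocked for 20 ms per request.
        System.out.printf("pool=200, 20 ms:        %.0f RPS%n", maxRps(200, 20));
        // 500 wrk connections, each waiting 20 ms, when threads are not the limit.
        System.out.printf("500 connections, 20 ms: %.0f RPS%n", maxRps(500, 20));
    }
}
```

For the inputs above the formula gives 10,000 and 25,000 RPS. These are upper bounds only; the measured figures in the table sit well below them.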
Endpoint 2: External-API-Bound (150 ms I/O per request)
| Threading Model | RPS | p50 Latency | p99 Latency | Notes |
|---|---|---|---|---|
| Platform Threads (pool=200) | 133 | 1,500 ms | 3,210 ms | Catastrophic under 500 concurrency; requests queue behind the 200 worker threads |
| Virtual Threads | 960 | 155 ms | 182 ms | ~7× gain; latency barely exceeds the intrinsic 150 ms delay |
| WebFlux (WebClient + Mono.delay) | 1,010 | 151 ms | 170 ms | 5% faster than virtual threads at this concurrency level |
Endpoint 3: CPU-Bound (5,000 SHA-256 iterations, no I/O)
| Threading Model | RPS | p50 Latency | p99 Latency | Notes |
|---|---|---|---|---|
| Platform Threads (pool=8, sized to cores) | 2,140 | 3.8 ms | 8.2 ms | Optimal: threads match CPU count, minimal context switching |
| Virtual Threads | 2,110 | 3.9 ms | 8.4 ms | Effectively identical: virtual threads run on 8 carrier threads, the same count as the platform pool |
| WebFlux | 1,980 | 4.1 ms | 9.0 ms | Marginally slower: Reactor operator overhead around the Mono.fromCallable() wrapper adds a small constant cost |
Combined Summary
| Workload | Winner | Virtual vs WebFlux gap | Platform vs Virtual gap |
|---|---|---|---|
| DB-bound (20 ms I/O) | WebFlux (by 8%) | 8%: rarely decision-relevant | 12×: platform threads are catastrophic here |
| HTTP-bound (150 ms I/O) | WebFlux (by 5%) | 5%: negligible for most services | 7×: same story as DB-bound |
| CPU-bound (no I/O) | Platform (by ~1% over virtual) | ~6% (virtual threads ahead) | Platform ≈ Virtual |
The headline number: For I/O-bound workloads, virtual threads and WebFlux are within 5-8% of each other. That gap is real but rarely decision-relevant: a 5% throughput difference does not justify a reactive rewrite if your team is comfortable with imperative code. The 12× gap between platform threads and the other two models, however, absolutely is decision-relevant.
How the Code Works
- Platform thread blocking: When Thread.sleep(20) runs on a platform thread, the OS parks that thread in a wait queue. The thread keeps its native stack (about 1 MB reserved by default) until the timer fires. With 500 concurrent requests and a 200-thread pool, 300 requests wait for a free worker thread, inflating p99 latency dramatically.
- Virtual thread unmounting: The JDK implements Thread.sleep() and blocking socket I/O so that a virtual thread unmounts from its carrier, with its stack frames saved on the heap. The carrier is immediately reused. When the timer fires, the virtual thread is remounted on any available carrier. Net result: 500 concurrent requests each occupy a small heap-allocated stack that grows with call depth, not a full reserved native stack apiece.
- Reactor event loop: WebFlux’s Netty server runs on a small fixed pool of event-loop threads (Reactor Netty defaults to the number of available processors, with a minimum of 4). Mono.delay() schedules a callback on a Reactor scheduler; no thread parks. The event-loop thread immediately returns to the selector loop to service other I/O events. WebFlux’s marginal throughput advantage over virtual threads plausibly comes from this: no per-request thread object, and no mount/unmount scheduling.
- CPU-bound parity: For the hash endpoint, no blocking ever occurs. Virtual threads run on the 8 carrier threads. Platform threads (sized to 8) do the same. Neither gains an advantage because the bottleneck is arithmetic throughput, not concurrency. Mono.fromCallable() on its own does not switch threads: the hash runs on the subscribing thread unless you add subscribeOn(...). The slightly lower WebFlux number is consistent with Reactor operator overhead rather than a scheduler hop.
- WebFlux + blocking JDBC trap: If you use WebFlux with a blocking JDBC driver (not R2DBC), you must offload every DB call to the boundedElastic scheduler with Mono.fromCallable(...).subscribeOn(Schedulers.boundedElastic()). Forgetting this blocks an event-loop thread and can collapse throughput to worse than platform threads. Virtual threads avoid this trap entirely: you simply call JDBC normally.
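The queueing effect described in the first two bullets can be reproduced without Spring or Tomcat. The sketch below assumes Java 21+ and uses made-up sizes (a 20-thread pool and 200 tasks) chosen only to keep the run short; `BlockingComparison` and `runBatch` are illustrative names, and the timings it prints will vary by machine.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Times a batch of sleeping tasks on a fixed platform pool vs. one virtual thread per task. */
class BlockingComparison {

    /** Submits {@code tasks} jobs that each sleep {@code sleepMs}, waits for all, returns elapsed ms. */
    static long runBatch(ExecutorService executor, int tasks, long sleepMs) {
        long start = System.nanoTime();
        try (executor) { // close() blocks until every submitted task has finished
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(sleepMs); // stands in for a blocking JDBC or HTTP call
                    return null;
                });
            }
        }
        return Duration.ofNanos(System.nanoTime() - start).toMillis();
    }

    public static void main(String[] args) {
        // 200 tasks on 20 platform threads run in roughly 10 waves of 20 ms each.
        long fixed = runBatch(Executors.newFixedThreadPool(20), 200, 20);
        // 200 virtual threads all sleep at the same time.
        long virtual = runBatch(Executors.newVirtualThreadPerTaskExecutor(), 200, 20);
        System.out.println("fixed pool (20 threads): " + fixed + " ms");
        System.out.println("virtual threads:         " + virtual + " ms");
    }
}
```

The fixed pool has to work through the tasks in waves, while the virtual-thread executor lets every task sleep concurrently. That is the same mechanism behind the platform-versus-virtual gap in the benchmark tables.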
⚠️ Where Reactive (WebFlux) Still Wins
The 5-8% throughput advantage shown in the benchmarks above understates reactive’s true value in two specific scenarios. These are the cases where you should not migrate from WebFlux to virtual threads.
Scenario A: Streaming Pipelines with Backpressure
If your service streams large results (paginated DB cursors, SSE feeds, file downloads), WebFlux’s Flux gives you producer-consumer backpressure for free. The downstream subscriber controls the emission rate; the upstream never over-produces. With virtual threads and a blocking InputStream, you can build the same pipeline, but backpressure requires manual queue management. The reactive model is structurally correct here.
// WebFlux: demand-based backpressure is built into the Flux contract.
// The subscriber signals how many items it is ready to receive.
@GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamResults() {
    return Flux.interval(Duration.ofMillis(100)) // emits at 10/sec
            .onBackpressureDrop()                // drop ticks a slow client cannot take
            .take(1_000)                         // stop after 1,000 items
            .map(tick -> "event-" + tick);
    // Flux.interval is time-driven and cannot slow its clock: without an
    // overflow strategy it signals an error when demand runs out.
    // Demand-driven sources simply stop emitting until more is requested.
    // No thread is parked either way.
}
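For comparison, the "manual queue management" route on virtual threads usually means a bounded queue between producer and consumer. This is a minimal sketch assuming Java 21+; `BoundedQueueBackpressure`, the poison-pill value, and the 1 ms consumer delay are all illustrative choices, not part of the benchmark project.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hand-rolled backpressure: a bounded queue makes a fast producer wait for a slow consumer. */
class BoundedQueueBackpressure {

    private static final int POISON_PILL = -1; // tells the consumer to stop

    /** Returns {itemsConsumed, maxQueueDepthSeenByProducer}. */
    static int[] run(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger consumed = new AtomicInteger();
        AtomicInteger maxDepth = new AtomicInteger();

        Thread producer = Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i); // blocks while the queue is full: this is the backpressure
                    maxDepth.accumulateAndGet(queue.size(), Math::max);
                }
                queue.put(POISON_PILL);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = Thread.ofVirtual().start(() -> {
            try {
                while (queue.take() != POISON_PILL) {
                    Thread.sleep(1); // deliberately slower than the producer
                    consumed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pipeline", e);
        }
        return new int[] {consumed.get(), maxDepth.get()};
    }

    public static void main(String[] args) {
        int[] result = run(50, 8);
        System.out.println("consumed=" + result[0] + ", max queue depth=" + result[1]);
    }
}
```

The queue capacity plays the role that the subscriber's request count plays in Reactor. The difference is that you own the queue, the shutdown signal, and the error handling yourself.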
Scenario B: Fully Async Stack (R2DBC + WebClient)
When every layer of your stack is non-blocking (R2DBC for the database, WebClient for downstream calls, reactive Redis/Kafka), WebFlux’s event loop operates with zero thread-parking overhead across the full request lifecycle. Virtual threads still allocate a heap object per request and pay JVM scheduler dispatch costs on every mount/unmount cycle. At extreme concurrency (>10,000 simultaneous requests), that overhead is measurable. If you are already on a fully reactive stack, staying there is the right call.
⚠️ Where Virtual Threads Are the Wrong Choice
| Scenario | Why virtual threads don’t help | Use instead |
|---|---|---|
| CPU-intensive tasks (hashing, image processing, ML inference) | No extra cores are created. Carrier pool = CPU count. Same as a fixed platform pool. | ForkJoinPool sized to availableProcessors() |
| Long synchronized blocks with blocking I/O (Java 21-23) | Virtual thread pins to carrier for the duration of the synchronized section. | ReentrantLock instead of synchronized; or JDK 24+, where JEP 491 removes this pinning |
| Blocking JDBC in a WebFlux service | Parks an event-loop thread. Throughput collapses immediately. | R2DBC, or wrap in .subscribeOn(Schedulers.boundedElastic()) |
| Fat ThreadLocal values at millions of requests | Each virtual thread carries its own ThreadLocal map, which bloats the heap at scale. | ScopedValue (final in Java 25) |
| Native / JNI blocking calls | Native frames pin the carrier until the native call returns. | Isolate in a bounded platform-thread pool |
| Fully reactive stack already in production | Migration cost buys a 5-8% regression in throughput, not a gain. | Stay on WebFlux |
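The ReentrantLock substitution in the second row is mostly mechanical. Below is a hedged sketch assuming Java 21+; `LockedResource` and `slowUpdate` are hypothetical names, and the 1 ms sleep stands in for whatever blocking I/O the original synchronized block guarded.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

/** A critical section guarded by ReentrantLock instead of synchronized. */
class LockedResource {

    private final ReentrantLock lock = new ReentrantLock();
    private int calls;

    /** Blocks while holding the lock; a waiting virtual thread parks instead of pinning its carrier. */
    int slowUpdate() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(1); // stands in for blocking I/O performed under the lock
            return ++calls;
        } finally {
            lock.unlock(); // always release, even if the guarded code throws
        }
    }

    int calls() {
        lock.lock();
        try {
            return calls;
        } finally {
            lock.unlock();
        }
    }

    /** Runs {@code tasks} concurrent updates on virtual threads and returns the final count. */
    static int runConcurrently(int tasks) {
        LockedResource resource = new LockedResource();
        Callable<Integer> update = resource::slowUpdate;
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(update);
            }
        } // close() waits for every submitted task
        return resource.calls();
    }

    public static void main(String[] args) {
        System.out.println("updates applied: " + runConcurrently(100));
    }
}
```

The lock/try/finally shape is the part that matters. Mutual exclusion is unchanged, so the updates still run one at a time; only the pinning behaviour differs.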
The Decision Framework
Copy this decision tree into your architecture decision record. The three terminal answers cover most real-world Spring Boot service architectures.
| If your service… | Pick | Because |
|---|---|---|
| …is primarily I/O-bound (DB, HTTP calls) and uses blocking drivers (JDBC, RestClient) and you value simple, debuggable code | Virtual Threads | You get WebFlux-level throughput without rewriting to reactive. One property switch. Stack traces are readable. Thread dumps work. ThreadLocals work. |
| …is primarily I/O-bound and already uses non-blocking drivers (R2DBC, WebClient) or requires streaming with backpressure | WebFlux (Reactive) | Your async drivers eliminate blocking entirely. Backpressure is a first-class primitive. The 5-8% throughput advantage compounds with a fully non-blocking stack. |
| …is primarily CPU-bound (hashing, compression, ML inference, image processing) | Platform Threads (ForkJoinPool) | Throughput is bounded by cores, not threads. A properly-sized ForkJoinPool matches virtual threads and avoids any carrier-scheduling overhead. Parallelism primitives like RecursiveTask are the correct abstraction. |
| …is a new greenfield service with no existing framework investment | Virtual Threads | Default choice for 2024+. Spring Boot 3.4 + virtual threads is the lowest-friction path to high-throughput I/O services. Reach for WebFlux only when backpressure or a fully reactive stack is explicitly required. |
| …is an existing WebFlux service performing well | Stay on WebFlux | Migration to virtual threads yields a 5-8% throughput regression if your stack is fully non-blocking, and significant refactoring risk. The ROI is negative unless you have a strong team-familiarity reason. |
| …mixes I/O-bound and CPU-bound work in the same request (e.g., fetch data then hash/encode it) | Virtual Threads + offload CPU work to a ForkJoinPool | Use virtual threads for the I/O phase (DB call, HTTP call). Submit CPU-intensive work to a separate fixed pool via CompletableFuture.supplyAsync(..., cpuPool). The two executors compose cleanly through CompletableFuture. |
Decision Tree (condensed)
Is your workload primarily I/O-bound?
├── YES
│   ├── Is your full stack non-blocking (R2DBC, WebClient)?
│   │   ├── YES → WebFlux (you're already there; no migration needed)
│   │   └── NO → Virtual Threads (one config line, 12× over platform threads)
│   └── Do you need streaming / backpressure?
│       ├── YES → WebFlux (Flux backpressure is structural)
│       └── NO → Virtual Threads
└── NO (CPU-bound)
    └── Platform Threads with ForkJoinPool(availableProcessors())
        (Virtual threads and WebFlux both tie here; pick platform for clarity)
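If you prefer the tree as something you can unit-test alongside an architecture decision record, it can be written as a small function. This sketch makes one interpretive assumption: the tree asks two questions under the I/O-bound branch, and a "yes" to either one is treated here as pointing to WebFlux. `ThreadingModelChooser` and its parameter names are illustrative.

```java
/** The condensed decision tree expressed as a pure function. */
class ThreadingModelChooser {

    enum Model { VIRTUAL_THREADS, WEBFLUX, PLATFORM_THREADS }

    /**
     * @param ioBound               true if the workload is primarily I/O-bound
     * @param fullyNonBlockingStack true if every driver is non-blocking (R2DBC, WebClient)
     * @param needsBackpressure     true if the service streams and needs backpressure
     */
    static Model recommend(boolean ioBound, boolean fullyNonBlockingStack, boolean needsBackpressure) {
        if (!ioBound) {
            return Model.PLATFORM_THREADS; // CPU-bound: size a pool to the core count
        }
        if (fullyNonBlockingStack || needsBackpressure) {
            return Model.WEBFLUX; // assumption: either "yes" is enough to stay reactive
        }
        return Model.VIRTUAL_THREADS; // blocking drivers, no streaming requirement
    }

    public static void main(String[] args) {
        // A JDBC-backed REST service with no streaming endpoints.
        System.out.println(recommend(true, false, false));
    }
}
```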
Enabling Virtual Threads in Spring Boot 3.4: Full Config Reference
# application.properties
# 1. Enable virtual threads (Java 21+, Spring Boot 3.2+)
spring.threads.virtual.enabled=true
# 2. Recommended: tune HikariCP independently of thread count.
# With virtual threads you can have thousands of concurrent requests,
# but your DB connection pool is still the real bottleneck.
spring.datasource.hikari.maximum-pool-size=50
spring.datasource.hikari.minimum-idle=10
# 3. Optional: name virtual threads for debugging
# (Spring Boot 3.4 names them automatically as "tomcat-handler-N")
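// Programmatic alternative if you want fine-grained control
// over which executor handles which task category.
// (Imports omitted for brevity.)
@Configuration
public class ThreadConfig {

    // Primary executor for Tomcat request handling: virtual threads
    @Bean
    public TomcatProtocolHandlerCustomizer<?> virtualThreadTomcatCustomizer() {
        return protocolHandler ->
                protocolHandler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    }

    // Separate CPU-bound pool for heavy compute tasks
    @Bean(name = "cpuPool")
    public ExecutorService cpuBoundPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        return new ForkJoinPool(cores);
    }
}

// Usage in a service that mixes I/O and CPU work:
@Service
public class HybridService {

    private final JdbcTemplate jdbcTemplate;
    private final ExecutorService cpuPool;

    // Explicit constructor: Lombok's @RequiredArgsConstructor does not copy
    // @Qualifier onto constructor parameters unless configured to do so.
    public HybridService(JdbcTemplate jdbcTemplate,
                         @Qualifier("cpuPool") ExecutorService cpuPool) {
        this.jdbcTemplate = jdbcTemplate;
        this.cpuPool = cpuPool;
    }

    public String fetchAndHash(String id) throws Exception {
        // I/O phase: runs on a virtual thread (current thread)
        String rawData = jdbcTemplate.queryForObject(
                "SELECT data FROM items WHERE id = ?", String.class, id);

        // CPU phase: offload to the dedicated ForkJoinPool
        return CompletableFuture.supplyAsync(() -> sha256(rawData), cpuPool).get();
    }

    // Hex-encoded SHA-256 of the input, defined here so the example is complete.
    private static String sha256(String input) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should always be available", e);
        }
    }
}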
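The same I/O-then-CPU hand-off can be tried without Spring or a database. The sketch below assumes Java 21+ and replaces the JDBC query with a 20 ms sleep; `HybridDemo`, `fetch`, and the `"row-"` payload are placeholders for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

/** I/O phase on the calling (virtual) thread, CPU phase on a pool sized to the core count. */
class HybridDemo {

    private static final ExecutorService CPU_POOL =
            new ForkJoinPool(Runtime.getRuntime().availableProcessors());

    /** Placeholder for a blocking database call. */
    static String fetch(String id) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-" + id;
    }

    static String sha256Hex(String input) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should always be available", e);
        }
    }

    static String fetchAndHash(String id) {
        String raw = fetch(id); // blocking phase: cheap on a virtual thread
        // CPU phase: runs on CPU_POOL; join() parks the caller until the hash is ready
        return CompletableFuture.supplyAsync(() -> sha256Hex(raw), CPU_POOL).join();
    }

    public static void main(String[] args) {
        try (ExecutorService requests = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 5; i++) {
                String id = String.valueOf(i);
                requests.submit(() -> System.out.println(id + " -> " + fetchAndHash(id)));
            }
        }
    }
}
```

Keeping the CPU pool separate means a burst of hashing work is capped at the core count, no matter how many virtual threads are waiting on I/O at the same moment.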
AI Prompts You Can Use
Prompt 1 β Audit an Existing WebFlux Service for Migration Readiness
What it does: Scans a WebFlux codebase for blocking calls inside reactive chains, identifies R2DBC vs JDBC usage, and gives a migration risk score for switching to virtual threads.
When to use it: Before committing to a virtual-thread migration from WebFlux.
Audit this Spring WebFlux codebase for virtual-thread migration readiness.
Identify: (1) blocking calls inside reactive chains (block(), subscribe() inside Mono/Flux),
(2) JDBC vs R2DBC usage, (3) ThreadLocal values used for request context,
(4) synchronized blocks wrapping I/O. Assign a migration risk score (Low/Medium/High)
and explain what refactoring would be required to switch to spring.threads.virtual.enabled=true.
Prompt 2 β Convert a Reactive Controller to Virtual-Thread-Friendly Blocking
What it does: Rewrites a Mono/Flux-returning controller to a straightforward blocking controller, replacing WebClient with RestClient and R2DBC with JDBC. Preserves error handling and transactional semantics.
When to use it: When your team has decided to migrate away from WebFlux and wants a mechanical starting point for the rewrite.
Convert this Spring WebFlux controller (using Mono/Flux return types) to a blocking
Spring MVC controller compatible with virtual threads. Replace WebClient calls with
RestClient, R2DBC calls with JDBC (JdbcTemplate or Spring Data JPA), and Mono error
operators with try-catch. Preserve all validation, error handling, and transactional
boundaries. Note any places where the reactive backpressure semantics cannot be
replicated in the blocking model.
Prompt 3 β Write a JMH Benchmark for the Three Models
What it does: Generates a proper JMH benchmark comparing all three threading models for a given endpoint, with warmup, measurement iterations, and fork count.
When to use it: Before making an architecture decision, to validate the numbers against your actual hardware and workload profile.
Generate a JMH benchmark comparing three Spring Boot 3.4 endpoint implementations:
(1) platform threads with Tomcat pool=200, (2) virtual threads with spring.threads.virtual.enabled=true,
(3) WebFlux with Mono.delay(). Use @BenchmarkMode(Mode.Throughput), 5 warmup iterations,
10 measurement iterations, 1 fork, and measure at 500 concurrent users.
Output results in requests/second. Include the @State setup that starts the embedded
Spring Boot context for each benchmark.
See Also
- 28× Faster? Virtual Threads vs Platform Threads in Java: Real Benchmarks, Code, and When to Use Which
- Leveraging Virtual Threads in Spring Boot 3.4+: Building High-Throughput Services
- Java Concurrency Deep Dive: CompletableFuture, ExecutorService, ForkJoinPool
- Java 21 to Java 25 LTS: Every Feature You Actually Need to Know
- Spring Security Context Propagation: Complete Guide
- From Raw Threads to Virtual Threads: A Developer’s Guide
- Java Streams API Deep Dive + Collectors Cookbook
- Modern Java Testing: JUnit 6 + AssertJ + Mockito 5 + Testcontainers
- Demystifying Thread Safety in Java: A Practical Guide
- AI Prompts Playbook: Upgrading Java 8 → 11 → 17 → 21 → 25
FAQs
Should I migrate my existing WebFlux service to virtual threads?
Only if your stack is not fully non-blocking. If you are using JDBC (not R2DBC), WebFlux forces you to use Schedulers.boundedElastic() for every DB call, which is structurally awkward and easy to forget. Virtual threads let you write plain blocking JDBC and get equivalent throughput. If your stack is already R2DBC + WebClient and performing well, migration buys you a 5-8% regression and significant refactoring risk. Stay put.
Do virtual threads replace reactive programming completely?
For the most common case (high-concurrency I/O-bound services using relational databases), yes: virtual threads are a pragmatic replacement that is simpler to write and maintain. Reactive programming retains advantages for streaming with producer-consumer backpressure (Flux), complex async orchestration (zip, merge, flatMap pipelines), and services where the full stack (including the DB driver) is non-blocking. These are narrower use cases than the “use WebFlux for everything high-concurrency” guidance that was common before Java 21.
Does spring.threads.virtual.enabled=true affect @Async methods?
Yes. With spring.threads.virtual.enabled=true, Spring Boot 3.2+ auto-configures the applicationTaskExecutor bean (used by @Async) as a SimpleAsyncTaskExecutor that creates virtual threads, and the task scheduler behind @Scheduled as a SimpleAsyncTaskScheduler that does the same. No additional configuration is required unless you define your own executor beans.
Can I mix virtual threads and WebFlux in the same application?
Technically yes, but it is generally inadvisable. If both spring-boot-starter-web and spring-boot-starter-webflux are on the classpath, Spring Boot starts a Spring MVC (servlet) application, so the Tomcat and Netty server models do not run side by side in one application. To serve both you would need separate deployments, with something like Spring Cloud Gateway routing between them. The simpler approach is to pick one model per service.
Does HikariCP work with virtual threads?
Yes, HikariCP works correctly with virtual threads as of version 5.1.0. A common misconception is that virtual threads eliminate the need for connection pooling. They do not. If you have 10,000 virtual threads all wanting a DB connection simultaneously, they will queue at HikariCP’s pool (not at the OS thread limit). Size your connection pool based on your DB server’s capacity, independently of your thread model. A pool of 20-50 connections is typical for most OLTP workloads regardless of whether you use virtual or platform threads.
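To see why the pool, not the thread count, sets the limit, here is a small simulation assuming Java 21+. A `Semaphore` stands in for the connection pool; it is not how HikariCP is implemented, and `PoolBottleneck`, the 5 ms "query", and the sizes are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Many virtual threads competing for a small, fixed number of "connections". */
class PoolBottleneck {

    /** Returns the highest number of connections that were in use at the same moment. */
    static int peakConcurrentConnections(int requests, int poolSize) {
        Semaphore pool = new Semaphore(poolSize); // stand-in for a connection pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.submit(() -> {
                    pool.acquire(); // virtual threads queue here, not at an OS thread limit
                    try {
                        peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(5); // simulated query
                    } finally {
                        inUse.decrementAndGet();
                        pool.release();
                    }
                    return null;
                });
            }
        } // close() waits for all requests to finish
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak connections in use: " + peakConcurrentConnections(1_000, 50));
    }
}
```

However many virtual threads are started, the peak never exceeds the pool size, which is why the pool should be sized for the database rather than for the request count.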
How do thread dumps look with virtual threads?
Java 21+ thread dumps can include virtual threads. The output format changed: jcmd <pid> Thread.dump_to_file -format=json <file> produces a JSON thread dump that lists virtual threads grouped by the container (for example, the executor) that owns them. Traditional jstack output is centred on platform threads and is not a reliable way to inspect large numbers of virtual threads. IntelliJ IDEA 2023.3+ and VisualVM 2.1.7+ display virtual threads in their thread views. Reactive (WebFlux) stack traces are famously harder to read: callbacks, operators, and Reactor internals pollute the trace. Virtual threads give you the same readable, imperative stack traces you had with platform threads.
Conclusion
The data is clear: for the workloads that define most Spring Boot microservices (relational database calls, downstream HTTP calls), virtual threads and WebFlux deliver comparable throughput. The 5-8% WebFlux advantage is real but rarely decision-relevant. The 12× advantage both models hold over a default platform-thread pool, however, absolutely is. If your service is I/O-bound and uses blocking drivers, spring.threads.virtual.enabled=true is the single most impactful configuration change you can make in Spring Boot 3.4. If you are already on a fully non-blocking reactive stack, stay there: migration costs outweigh the gain. If your workload is CPU-bound, neither reactive nor virtual threads helps: size a ForkJoinPool to your core count and move on. The decision tree above encodes these three cases. Apply it, benchmark against your actual workload profile, and ship.