"parallelizing" php and keeping it simple_// date: 04 jul 2023
// categories: technical
There are many problems to which “If it’s silly and it works, it isn’t silly” might apply. These sorts of problems come up surprisingly often in software engineering. It’s tempting to reach for complex solutions and we need to constantly remind ourselves to keep it simple.
Many years ago at Facebook, we developed the concept of BigPipe and pagelets to make the website faster and it’s worth skimming the linked post to get some context. There’s a particular feature of pagelets that I always enjoy telling the history of: “parallel pagelets”.
As a preface to this, it’s important to note that Hack (and PHP) is inherently single-threaded and every request is totally isolated from every other one. There is, however, cooperative multitasking with await, so these single-threaded requests can utilize the CPU while waiting for I/O.
Most user-facing webserver processing on Facebook is I/O bound since it takes a significant number of fetches from databases, caches, and backend services to render things like newsfeed or the chat sidebar, so this model fit the bill most of the time. With BigPipe and pagelets, we could flush individual parts of the page as they were done and keep the client and server doing useful work at all times when loading a page.
However, pagelets were just a layer over await; there was no true parallelism happening. There were parts of the homepage that were actually CPU bound, and we would get a performance speedup if we could run them in parallel. For example, the part of the page responsible for building the newsfeed query, querying the backend, and rendering the results was more CPU bound than any other part of the page and would block lighter tasks from completing.
Thus, “parallel pagelets” were born: from the pagelet flushing logic’s point of view, it’s just another pagelet but, under the hood, the actual rendering is happening in a separate request on a separate thread. At first glance, this seems to add a lot of complexity given the single-threaded nature of the language, but the original implementation of this feature in the early days of BigPipe was deceptively simple and kind of funny.
me, after reading the original code for this
making php “parallel”
So, how did this feature work, exactly?
We would literally send a cURL request to localhost/pagelet.php?pagelet=MyPagelet&args=[serialized_args]. Like, with the PHP cURL library. That’s it; we’re “parallel” now.
That endpoint would execute the pagelet in a totally separate thread, since it’s a new request from the server’s point of view, and return a serialized payload that could be consumed by the main thread when the cURL request terminated. That payload could then be deserialized and handed off to BigPipe on the main thread as if it had been rendered as a “local”, non-parallel pagelet. This technique is actually how newsfeed was rendered for quite some time.
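The whole trick can be sketched in plain PHP. Everything here is illustrative: `pagelet_url`, the `pagelet.php` endpoint, and the JSON serialization are stand-ins I made up for the real internal code, not the actual implementation.

```php
<?php
// Build the "parallel" pagelet URL. The real system used an internal
// serialization format for the args; JSON stands in for it here.
function pagelet_url(string $pagelet, array $args): string {
    return 'http://localhost/pagelet.php?' . http_build_query([
        'pagelet' => $pagelet,
        'args'    => json_encode($args),
    ]);
}

// Kick off the pagelet in a separate request. The server handles it on
// another thread, so the rendering genuinely runs in parallel with us.
// (Requires the curl extension.)
function pagelet_start(string $pagelet, array $args) {
    $ch = curl_init(pagelet_url($pagelet, $args));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return $ch;
}

// Block until the "parallel" work is done and hand back the decoded
// payload, as if it had been rendered locally.
function pagelet_result($ch): array {
    $payload = curl_exec($ch);
    curl_close($ch);
    return json_decode($payload, true);
}
```

In practice the main request would fire several of these with the `curl_multi_*` family and keep doing its own work while the child requests rendered their pagelets.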
The concept of “parallel pagelets” has evolved over the years and is now baked into HHVM to be more performant, with some special handling inside the server and more advanced primitives to stream results for even better pipelining, but conceptually, it’s the same deal.
```hack
function pagelet_server_task_start(
  string $url, // <-- hi, cURL!
  array $headers = dict[],
  string $post_data = '',
  array $files = dict[],
  int $timeout_seconds = 0,
): resource;
```
The API is still reminiscent of its cURL-based origins: the first argument to the built-in pagelet function is a URI path instead of something more structured, which goes to show how well the original solution worked. I believe there was an intention to “distribute” these requests across a cluster by potentially cURLing another machine instead of localhost, but that never ended up materializing.
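For completeness, here is roughly what using the modern built-in looks like. Treat this as a hedged sketch: the task-start signature is the one quoted above, but the exact shape of `pagelet_server_task_result` has varied across HHVM versions, so this is closer to pseudocode than copy-pasteable Hack.

```hack
// Sketch only: the result-fetching signature differs across HHVM
// versions (by-ref vs inout headers/status), so don't copy this verbatim.
$task = pagelet_server_task_start(
  '/pagelet/my_pagelet?args=' . urlencode($serialized_args),
);

// ... the main request keeps rendering other pagelets here ...

// Block until the child request finishes and collect its payload,
// which BigPipe can then flush like any other pagelet's output.
$headers = null;
$status = null;
$payload = pagelet_server_task_result($task, inout $headers, inout $status);
```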
wow, let’s parallelize everything!
Once this technique existed and was generally accepted as a reasonable thing to do, lots of pagelets decided to mark themselves as parallel, because parallel must be faster. I mean, it’s parallel after all, right?
Unfortunately, nothing is free and there are tradeoffs to spinning up an entirely fresh request to shoehorn multithreading into a language with a request-per-thread model. An obvious issue that came up in practice is double-fetching because the pagelet threads miss out on request-level caches for data fetches that may have happened in the main thread. Making new requests also involves a lot of overhead and initial startup logic (authentication, routing, etc.) that can’t be totally eliminated, so doing this for small tasks was never worth it.
At the end of the day, this idea stuck around and was a net positive for web performance and helped us squeeze the most out of PHP. No matter how many times I tell this story, I can’t help but “d’oh” a little bit.
except when using some particular extensions, like APC ↩
there are ways to literally make threads in php, but they wouldn’t work at the time. there was too much context that the code expected to be running in a web request (path checks, cookies, etc.) and this solution let us transparently move the code to another thread without rewriting it all. ↩