People are really bad at agreeing on technical terms. Sometimes we use different terms for the same thing, and sometimes we use the same term for different things. Learning a concept from one source might lead you to use a particular term where another source uses a different term for that same concept. And this is not specific to software engineers: I remember a viral math problem, 60÷2(3+7), where nobody seemed to agree on the operator precedence. And it’s true that there is no consensus on how that expression should be read. The terms concurrent, parallel, asynchronous, and synchronous are no different: they can have different meanings in different contexts, and people define them differently. In this article, I will give you my view on those terms.
The term asynchronous is usually used as the antonym of synchronous. You might hear it in these contexts:
- JavaScript’s async function
- AJAX (Asynchronous JavaScript and XML)
- C#’s asynchronous programming
- Rust’s asynchronous programming
- Python’s asyncio
- POSIX asynchronous IO
- Asynchronous prefetching
Even though those uses mean different things, they have something in common. To me, the term asynchronous refers to the execution dependency between two or more tasks. Two tasks are asynchronous if they can be executed independently of each other. Knowing that two tasks are asynchronous, we can improve performance by executing them at the same time instead of making one wait for the other to finish. Even though I said we can, it doesn’t mean we will or we should. Saying that two tasks are asynchronous means that whatever order those two tasks are executed in, the result will still be correct. So if our tasks are executed one at a time, they are still asynchronous.
In contrast, when two tasks are synchronous, one of those tasks depends on the other. When we say two tasks are synchronous, we need to specify which task depends on which, and we expect the result of executing them to be as if they were executed one after the other in the correct order.
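To make the "whatever the order, the result is still correct" idea concrete, here is a minimal sketch in Go. The sum function and the three run helpers are hypothetical names invented for this illustration; the only assumption is that the two tasks touch no shared state.

```go
package main

import (
	"fmt"
	"sync"
)

// sum is a hypothetical task: it reads only its own input.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// runInOrder executes task A first, then task B.
func runInOrder(a, b []int) (int, int) {
	ra := sum(a)
	rb := sum(b)
	return ra, rb
}

// runReversed executes task B first, then task A.
func runReversed(a, b []int) (int, int) {
	rb := sum(b)
	ra := sum(a)
	return ra, rb
}

// runTogether starts both tasks as goroutines and waits for both to finish.
func runTogether(a, b []int) (int, int) {
	var ra, rb int
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); ra = sum(a) }()
	go func() { defer wg.Done(); rb = sum(b) }()
	wg.Wait()
	return ra, rb
}

func main() {
	a, b := []int{1, 2, 3}, []int{10, 20}
	fmt.Println(runInOrder(a, b))  // 6 30
	fmt.Println(runReversed(a, b)) // 6 30
	fmt.Println(runTogether(a, b)) // 6 30
}
```

All three strategies produce the same pair of results, which is what makes the two tasks asynchronous in the sense used here. Whether running them together is actually faster is a separate question.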
What “task” means is quite abstract, and I do that on purpose. Depending on the context, a task might mean different things. In multithreaded C programming, a task might mean a thread. In Go, it might mean a goroutine. In POSIX asynchronous IO, asynchronous means the task that performs the IO can be executed independently of the caller process.
Note that the terms asynchronous and synchronous have nothing to do with the actual execution. For example, in the POSIX asynchronous IO case, the caller process doesn’t have to wait until the IO is finished in order to continue its job. But if the caller process happens to continue only after the IO has been performed, those tasks don’t become synchronous. The terms are just a way for us to declare whether two tasks can be executed independently.
In the context of Go, you can declare two tasks as asynchronous by creating a goroutine. Consider the code below:
    fmt.Print("One.")       // (1)
    fmt.Print("Two.")       // (2)
    go func() {
        fmt.Print("Three.") // (3)
        fmt.Print("Four.")  // (4)
    }()
    fmt.Print("Five.")      // (5)
    fmt.Print("Six.")       // (6)
In the above code, basically we are saying that:
- (2) is synchronous with (1)
- (3) is synchronous with (2)
- (4) is synchronous with (3)
- (5) is synchronous with (2)
- (5) is asynchronous with (3)
- (5) is asynchronous with (4)
- (6) is synchronous with (5)
Which means:
- the execution should look as if (2) was executed after (1)
- the execution should look as if (3) was executed after (2)
- the execution should look as if (4) was executed after (3)
- the execution should look as if (5) was executed after (2)
- it doesn’t matter whether (5) is executed after or before (3)
- it doesn’t matter whether (5) is executed after or before (4)
- the execution should look as if (6) was executed after (5)
So these are some examples of valid output of the above code:
One.Two.Three.Four.Five.Six.
One.Two.Five.Six.Three.Four.
One.Two.Three.Five.Four.Six.
But, this is not a valid output:
One.Three.Four.Two.Five.Six.
The output above is not valid because it violates the synchronous relation between (2) and (3) and, through (3), between (2) and (4). The output should look as if (3) and (4) were executed after (2), but it looks as if (2) was executed after (3) and (4).
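If you want to try this yourself, here is a runnable sketch of the same six steps. It adds a few things that are not in the snippet: a sync.WaitGroup so the program waits for the goroutine (otherwise a real Go program could exit before "Three." and "Four." are printed), a mutex-protected buffer instead of printing directly, and a hypothetical valid helper that checks the synchronous relations listed earlier.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// run executes the six steps and returns the combined output as a string.
func run() string {
	var mu sync.Mutex
	var out strings.Builder
	emit := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		out.WriteString(s)
	}

	var wg sync.WaitGroup
	emit("One.") // (1)
	emit("Two.") // (2)
	wg.Add(1)
	go func() {
		defer wg.Done()
		emit("Three.") // (3)
		emit("Four.")  // (4)
	}()
	emit("Five.") // (5)
	emit("Six.")  // (6)
	wg.Wait()     // wait for the goroutine so (3) and (4) are not lost
	return out.String()
}

// valid reports whether out respects every synchronous relation in the list.
func valid(out string) bool {
	pos := func(s string) int { return strings.Index(out, s) }
	before := func(a, b string) bool { return pos(a) >= 0 && pos(a) < pos(b) }
	return before("One.", "Two.") &&
		before("Two.", "Three.") &&
		before("Three.", "Four.") &&
		before("Two.", "Five.") &&
		before("Five.", "Six.")
}

func main() {
	for i := 0; i < 5; i++ {
		out := run()
		fmt.Println(out, valid(out))
	}
}
```

The exact interleaving may differ from run to run, but every run should pass the valid check, while the invalid output shown above does not.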
You might notice that I say “as if it was executed” instead of explicitly saying “should be executed”. This is because asynchronous and synchronous have nothing to do with the actual execution, only with the result. We only care about the output (or the measurable behavior) of the total execution. If tasks A and B are synchronous and B should look as if it was executed after A, but the CPU (or whatever executes those tasks) executes B first, we can’t say that the CPU is wrong. In order to say that the CPU is wrong, you need to show that the output is wrong. Consider this example:
    a := 10            // (1)
    b := 20            // (2)
    fmt.Println(a + b) // (3)
In the code above, (1), (2), and (3) are synchronous. By that definition, the result should be as if (2) was executed after (1), and (3) was executed after (2), so there is only one acceptable output: 30. Now, if the CPU plays some trick and executes (2) before (1), or the whole thing even gets rewritten into fmt.Println(30), we can’t say that the CPU is wrong. As long as the final output looks as if the statements were executed one by one in the correct order, there is nothing wrong with it.
By the way, the example above is not arbitrary: most CPUs that we have today do play tricks like that to improve throughput. Intel’s x86 and Arm CPUs can reorder your instructions if reordering makes the execution faster. Not only that, your compiler can also play tricks like that. If your compiler can prove that the execution order doesn’t matter, it might reorder your code to get better performance. These reorderings only promise to preserve the behavior of a single thread. When you are working on a multithreaded application, another thread can observe the reordered operations, which can cause very subtle bugs. To avoid this, you need to tell the compiler and CPU which parts of your code must keep their exact order. You can do this by using a memory barrier.
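Go does not expose a raw memory barrier instruction, so the sketch below uses the sync/atomic package instead, which is one of the ways (along with channels and mutexes) to ask Go for an ordering guarantee between goroutines. The publish function and its variable names are made up for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publish writes a payload in one goroutine and reads it in another,
// using an atomic flag to order the write before the read.
func publish() int {
	var data int    // plain, non-atomic payload
	var ready int32 // flag accessed only through sync/atomic

	done := make(chan int)
	go func() {
		// Reader: wait until the flag is observed, then read the payload.
		for atomic.LoadInt32(&ready) == 0 {
			runtime.Gosched() // let other goroutines run while waiting
		}
		done <- data
	}()

	data = 42                    // (a) write the payload
	atomic.StoreInt32(&ready, 1) // (b) publish: (a) must be visible before (b)
	return <-done
}

func main() {
	fmt.Println(publish()) // 42
}
```

If ready were a plain variable read and written without sync/atomic, this program would contain a data race, and the reader would have no guarantee of seeing 42 even after seeing the flag set.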
Now, that is the way I view asynchronous and synchronous tasks. But again, we software engineers don’t really have a formal definition of those terms, and some people do define them differently. Some people define asynchronous as a task that might not be finished yet but will be finished sometime in the future. To me, asynchronous is not a behavior of a task, but rather a relation between two tasks. Some people define asynchronous as running a task in the background at the same time as another task. To me, that’s not exactly correct, since we can have asynchronous tasks in a single-threaded environment like JavaScript, and even on a single-core machine. Now, which definition should you choose? It depends on which article you read, the person you are talking to, or your audience. When talking about JavaScript code, since async is a keyword, you may want to define asynchronous as any task that is executed inside an async function. When talking about async IO, you can think of IO that doesn’t block the user thread and whose completion can be polled at any time.
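As a small illustration of the "will be finished sometime in the future" view, here is a sketch in Go where a task hands back a channel that will eventually carry its result. The startSquare function is a hypothetical helper, and runtime.GOMAXPROCS(1) is set only to show that the tasks stay asynchronous even when at most one thread executes Go code at a time.

```go
package main

import (
	"fmt"
	"runtime"
)

// startSquare starts a task and returns a channel that will eventually
// carry its single result.
func startSquare(n int) <-chan int {
	ch := make(chan int, 1) // buffered, so the task never blocks on send
	go func() { ch <- n * n }()
	return ch
}

func main() {
	// Allow only one thread to execute Go code at a time: no parallelism.
	runtime.GOMAXPROCS(1)

	result := startSquare(7) // the task may or may not have run yet
	fmt.Println("doing other work")
	fmt.Println(<-result) // wait here until the result exists: 49
}
```

Both views describe the same program: the caller and the task can run independently of each other, and the receive on the channel is the point where the caller declares that it now depends on the result.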